This is insanely fast, obviously a game changer over time. You should try the demo!
This seems to be using custom inference-only HW. It makes a ton of sense to use different HW for inference vs. training; the requirements are different.
Nvidia, as far as I can tell, is going all-in on training and hoping the same HW will be used for inference.
Hi there, I work for Groq. That's right. We love graphics processors for training, but for inference our Language Processing Unit (LPU) is by far the fastest and lowest latency. Feel free to ask me anything.
Exciting times!