
Groq presents specialized language processing unit significantly faster than Nvidia's AI accelerators

Groq LPU (Image Source: Groq)
The LPU Inference Engine from Groq is designed to be considerably faster than GPGPUs at processing LLM data. To achieve this, the LPU is tailored to the sequential nature of language processing and is paired with on-chip SRAM instead of DRAM or HBM.

While Nvidia is currently enjoying outstanding profits as it rides the AI wave fueled by the increasing demand for compute GPUs, the market could become more decentralized as more companies step in to provide viable alternative AI processors. We have already seen efforts from several companies in this regard, including AMD, d-Matrix, OpenAI and Samsung. Notably, quite a few engineers who helped design Google’s tensor processing unit (TPU) are now involved in independent AI projects that promise to outclass Nvidia’s solutions. Samsung, for instance, recently announced that its new AGI Computing Lab opening in Silicon Valley is led by former Google TPU developer Dr. Woo Dong-hyuk. Another key engineer who helped develop the Google TPU is Jonathan Ross, now the CEO of a new company called Groq. Drawing on the experience accumulated at Google, Ross brings innovation to the AI accelerator market with the world’s first Language Processing Unit (LPU).

Groq's LPU is specifically designed to process large language models (LLMs) and has clear advantages over general-purpose GPUs or NPUs. Groq initially developed the Tensor Stream Processor (TSP), which was later rebranded as the Language Processing Unit to reflect its focus on inference for generative AI workloads. Because it targets LLMs exclusively, the LPU is much more streamlined than a GPGPU, allowing for simplified scheduling hardware with lower latency, sustained throughput and increased efficiency.

Consequently, the LPU reduces the time needed to compute each word, so sequences of text can be generated much faster. Another key improvement is that the LPU eliminates the need for expensive HBM, relying instead on just 230 MB of SRAM per chip with 80 TB/s of bandwidth, which makes it considerably faster than traditional GPGPU solutions. Groq's architecture is also scalable: multiple LPUs can be interconnected to provide more processing power for larger, more complex LLMs.
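
To illustrate why memory bandwidth matters so much here, the sketch below gives a rough, roofline-style upper bound on decode throughput in Python. Only the 80 TB/s SRAM bandwidth comes from Groq's figures; the model size, weight precision and HBM bandwidth are assumptions chosen purely for illustration.

```python
# Illustrative estimate: autoregressive decoding is typically memory-bound,
# because every generated token must stream the model weights once.
# Only the 80 TB/s SRAM figure is from the article; everything else is assumed.

def tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode throughput if each token reads all weights once."""
    return bandwidth_bytes_per_s / model_bytes

# Assumed example model: 70B parameters at 1 byte per parameter (8-bit weights).
model_bytes = 70e9

hbm_bw = 3.35e12   # ballpark HBM bandwidth of a current high-end accelerator (assumed)
sram_bw = 80e12    # 80 TB/s per-chip SRAM bandwidth cited for the LPU

print(f"HBM-bound limit:  ~{tokens_per_second(model_bytes, hbm_bw):.0f} tokens/s")
print(f"SRAM-bound limit: ~{tokens_per_second(model_bytes, sram_bw):.0f} tokens/s")
# Note: a single LPU holds only 230 MB of SRAM, so a model of this size would
# have to be sharded across many interconnected chips, as the article notes.
```

This is only a first-order bound; in practice, scaling across many chips, interconnect overhead and compute limits all shave off some of that headroom.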

To demonstrate how much faster the LPU Inference Engine is compared to GPUs, Groq provides a video comparison of its own chatbot, which can switch between the Llama 2 and Mixtral LLMs, against OpenAI's ChatGPT. Groq claims that the LPU generates the text in a fraction of a second, with roughly three quarters of the total response time spent retrieving relevant information rather than generating tokens.
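
Readers who want to run a similar speed check themselves could time a completion against Groq's hosted inference service. The minimal sketch below uses Groq's OpenAI-style Python client; the package name, model identifier and response fields reflect the public API at the time of writing and are best treated as assumptions that may change.

```python
# Minimal sketch: time a single completion against Groq's hosted LPU inference.
# Assumes the `groq` Python package is installed and GROQ_API_KEY is set;
# the model id "mixtral-8x7b-32768" is an assumption and may change.
import os
import time

from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

start = time.perf_counter()
response = client.chat.completions.create(
    model="mixtral-8x7b-32768",
    messages=[{"role": "user", "content": "Explain what an LPU is in two sentences."}],
)
elapsed = time.perf_counter() - start

text = response.choices[0].message.content
tokens = response.usage.completion_tokens  # OpenAI-style usage accounting
print(f"{tokens} tokens in {elapsed:.2f} s (~{tokens / elapsed:.0f} tokens/s)")
print(text)
```

Because the client mirrors OpenAI's chat-completions interface, the same script pointed at ChatGPT's API makes for a rough side-by-side latency comparison like the one in Groq's video.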

Single chip accelerator (Image Source: Groq)

Bogdan Solca, 2024-02-28 (Update: 2024-02-28)