Move Over Groq, Cerebras Now Has The World’s Fastest AI Inference

Cerebras has finally opened access to its Wafer-Scale Engine (WSE), and it is achieving 1,800 tokens per second while running inference on the Llama 3.1 8B model. On the larger Llama 3.1 70B model, Cerebras clocks up to 450 tokens per second. Until now, Groq was the fastest AI inference provider, but Cerebras has taken that crown.

Cerebras has developed its own wafer-scale processor that integrates close to 900,000 AI-optimized cores and packs 44GB of on-chip memory (SRAM). As a result, the AI model can be stored directly on the chip itself, unlocking groundbreaking memory bandwidth. Cerebras is also running Meta's full 16-bit precision weights, meaning there is no compromise on accuracy.
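To see why on-chip storage matters, here is a quick back-of-the-envelope sketch of the weight footprint at 16-bit precision. The only figures taken from the article are the 44GB of SRAM and the model sizes; the rest is simple arithmetic, not Cerebras' own accounting (it ignores KV cache and activations).

```python
# Illustrative arithmetic: approximate weight footprint at fp16 (2 bytes/param).
# Overheads like KV cache and activations are ignored for simplicity.

def weight_size_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate model weight size in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

llama_8b = weight_size_gb(8e9)    # ~16 GB -> fits within the WSE's 44 GB of SRAM
llama_70b = weight_size_gb(70e9)  # ~140 GB -> exceeds a single chip's 44 GB
print(f"Llama 3.1 8B  @ fp16: ~{llama_8b:.0f} GB")
print(f"Llama 3.1 70B @ fp16: ~{llama_70b:.0f} GB")
```

As the numbers suggest, the 8B model's weights fit comfortably in 44GB of SRAM, while the 70B model's weights would have to be distributed beyond a single chip.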

I tested Cerebras' claim, and it generated responses at a breakneck pace. Running the smaller Llama 3.1 8B model, it achieved 1,830 tokens per second, and on the 70B model, Cerebras managed 446 tokens per second. In comparison, Groq pulled 750 T/s and 250 T/s on the 8B and 70B models, respectively.
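To put those throughput numbers in practical terms, the sketch below converts the tokens-per-second figures measured above into wall-clock time for a 1,000-token response. The response length is an arbitrary example; the rates are the ones quoted in this article.

```python
# Convert measured throughput (tokens/second) into wall-clock generation time
# for a hypothetical 1,000-token response.

measured_tps = {
    "Cerebras (Llama 3.1 8B)":  1830,
    "Cerebras (Llama 3.1 70B)": 446,
    "Groq (Llama 3.1 8B)":      750,
    "Groq (Llama 3.1 70B)":     250,
}

def seconds_for(tokens: int, tps: float) -> float:
    """Time in seconds to generate `tokens` at `tps` tokens/second."""
    return tokens / tps

for provider, tps in measured_tps.items():
    print(f"{provider}: {seconds_for(1000, tps):.2f} s for 1,000 tokens")
```

At these rates, Cerebras returns a 1,000-token answer on the 8B model in roughly half a second, versus over a second on Groq.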

Artificial Analysis independently reviewed Cerebras' Wafer-Scale Engine and found that it does deliver unmatched speed at AI inference. You can click here to check out Cerebras Inference for yourself.
