Cerebras Shares Soar in Chip IPO
Cerebras Systems' (Nasdaq: CBRS) blockbuster IPO today had Wall Street buzzing even before the NVIDIA rival’s shares started trading on the public market. Investors are clearly responding to market demand for alternatives to NVIDIA’s dominance in AI and to the drawbacks of GPUs in inference processing.
The IPO price of $185 per share for Cerebras stock came in significantly above the initial range of $150 to $160, thanks to oversubscribed demand. The offering of 30 million Class A shares raised $5.55 billion Wednesday from the underwriters, which would place the company’s valuation at about $40 billion.
As of this writing, Cerebras shares had soared over 65% and were trading above $300 per share.
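The pricing figures above can be cross-checked with simple arithmetic. The sketch below uses only numbers reported in this article; the variable names are illustrative, and the 65% gain is treated as a lower bound because the shares were reported as up "over 65%."

```python
# Sanity check of the IPO figures reported in the article.
ipo_price = 185.0             # USD per share (reported IPO price)
shares_offered = 30_000_000   # Class A shares in the offering
range_top = 160.0             # top of the initial price range, USD

# Gross proceeds: price per share times shares sold.
gross_proceeds = ipo_price * shares_offered

# How far the final price landed above the top of the initial range.
premium_over_range_top = ipo_price / range_top - 1

# Share price implied by a 65% first-day gain (a lower bound).
reported_gain = 0.65
implied_price = ipo_price * (1 + reported_gain)

print(f"Gross proceeds: ${gross_proceeds / 1e9:.2f} billion")
print(f"Premium over top of range: {premium_over_range_top:.1%}")
print(f"Price after a 65% gain: ${implied_price:.2f}")
```

The results line up with the reported $5.55 billion raise and with shares trading above $300.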
A handful of major AI players stand to benefit from the Cerebras IPO, including OpenAI, which has over 33 million warrants as well as a multi-year agreement to deploy Cerebras chips, an arrangement that reportedly has ballooned to be worth over $20 billion to the chipmaker.
According to Bloomberg data, the IPO has bested Arm’s $5.23 billion IPO in 2023, making it the largest semiconductor IPO ever listed in the U.S.
An Eventful Path to IPO
Getting here hasn’t been smooth. In 2024, Cerebras filed an S-1 to go public, but withdrew the plan in October 2025, citing a “procedural shift” toward creating a more updated prospectus. Investors had questioned the original S-1’s revelation that one customer, Abu Dhabi’s G42, held substantial stock warrants for Cerebras and was the chipmaker’s chief customer. A review of the G42 stock deal by the Committee on Foreign Investment in the United States (CFIUS) raised eyebrows as well. (Notably, Cerebras’s April 2026 S-1 cited G42, AWS, OpenAI, and Abu Dhabi’s Mohamed bin Zayed University of Artificial Intelligence as its anchor customers.)
In September 2025, shortly before withdrawing its original S-1, Cerebras raised $1.1 billion in Series G funding at a valuation of $8.1 billion. As part of the announcement, Cerebras revealed that AWS and Meta were key customers, and it claimed to be 20X faster at inference than NVIDIA GPUs. Then in February 2026, the company announced $1 billion in Series H funding at a $23 billion valuation.
The road to IPO had another big twist: Arm and SoftBank reportedly offered to buy Cerebras weeks before the public debut but were turned away.
Big Deals in Inference
Cerebras is leading a charge into inference chips that has become a major market focus. Founded in 2015 by a group of former AMD engineers, including Andrew Feldman, now CEO; Gary Lauterbach (CTO Emeritus, now retired); Michael James (now Chief Architect, Advanced Technology); Sean Lie (now CTO); and Jean-Philippe Fricker (now Chief System Architect), the startup made its name with its unique wafer-scale design.
Cerebras's Wafer Scale Engine, or WSE, packs the power of multiple chips onto a single boundary-free “dinner plate.” This is particularly effective in speeding up inference workloads, which require multiple repetitive passes through memory. Because the WSE eliminates the inter-chip gaps that can slow down traditional GPUs and does not need to access external memory chips, it is faster for inference.
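To see why memory access matters so much for inference, consider a common back-of-the-envelope model: when generating text one token at a time for a single user, a dense model's weights are read once per token, so memory bandwidth divided by model size gives a rough ceiling on tokens per second. The sketch below is a simplification under that assumption; the function name and all numbers are hypothetical and are not Cerebras or NVIDIA specifications.

```python
def max_decode_tokens_per_second(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Rough ceiling on single-stream decode speed when weight reads dominate.

    Assumes every weight is read once per generated token and ignores
    compute time, caching, batching, and interconnect overhead.
    """
    model_bytes = param_count * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


# Hypothetical values for illustration only (not vendor specifications).
params = 70e9                 # a 70-billion-parameter model
bytes_per_param = 2           # 16-bit weights
external_memory_bw = 3e12     # assumed 3 TB/s to off-chip memory
on_wafer_memory_bw = 300e12   # assumed 100x higher on-chip bandwidth

slow = max_decode_tokens_per_second(params, bytes_per_param, external_memory_bw)
fast = max_decode_tokens_per_second(params, bytes_per_param, on_wafer_memory_bw)

print(f"External-memory ceiling: {slow:.1f} tokens/s")
print(f"On-chip-memory ceiling:  {fast:.1f} tokens/s")
print(f"Ratio: {fast / slow:.0f}x")
```

Under this model the speedup tracks the bandwidth ratio directly, which is why keeping weights close to the compute cores is attractive for latency-sensitive inference. Real systems will differ, since batching and other optimizations change the picture.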
In March 2026, AWS and Cerebras announced an agreement to advance AWS’s inference capabilities. The press release stated that the solution would combine AWS Trainium-powered servers, Cerebras chips, and Amazon’s Elastic Fabric Adapter (EFA) networking. The unified solution will be deployed on Amazon Bedrock in AWS datacenters. Later in 2026, Cerebras said it would offer open-source LLMs and Amazon’s Nova portfolio of foundation models powered by Cerebras products.
In Amazon’s announcement of the deal, David Brown, Vice President, Compute & ML Services, AWS, stated:
“Inference is where AI delivers real value to customers, but speed remains a critical bottleneck for demanding workloads like real-time coding assistance and interactive applications. What we're building with Cerebras solves that: by splitting the inference workload across Trainium and [Cerebras chip] CS-3, and connecting them with Amazon’s Elastic Fabric Adapter, each system does what it's best at. The result will be inference that's an order of magnitude faster and higher performance than what's available today.”
OpenAI’s announcement about its Cerebras deal also cited inference:
“Integrating Cerebras into our mix of compute solutions is all about making our AI respond much faster. When you ask a hard question, generate code, create an image, or run an AI agent, there is a loop happening behind the scenes: you send a request, the model thinks, and it sends something back. When AI responds in real time, users do more with it, stay longer, and run higher-value workloads.
“We will integrate this low-latency capacity into our inference stack in phases, expanding across workloads.”
NVIDIA itself has been quick to respond to the inference demand. In March it released its NVIDIA Groq 3 LPX Rack, which comprises 256 Groq Language Processing Units (LPUs), chips architected for what NVIDIA calls “the low-latency and large-context demands of agentic systems.” The chip is the result of NVIDIA’s $20 billion purchase of inference technology from Groq last year.
Futuriom Take: Cerebras's blockbuster IPO is both a testament to the need for alternatives to NVIDIA’s GPU dominance and a nod to growing demand for chips that handle AI inference workloads.