Qualcomm Wants to Be an NVIDIA Alternative Too

AI Chips

By: Craig Matsumoto


AMD has some more company: Qualcomm has announced its own accelerator chips to compete with NVIDIA. The AI200 and AI250 are due to arrive in 2026 and 2027, respectively.

Qualcomm’s stock rose 11% after the announcement yesterday morning. It’s given back some gains but, as of this writing, is still up 7% overall compared to the pre-announcement price.

The announcement pushes Qualcomm into the big-iron AI fight. We all know Qualcomm for small chips that go into small spaces like cellphones. The company also makes the Snapdragon processors that find their way into laptops. Now it's going to have to compete at rack scale, because that kind of integration and modularity is part of the new rules for AI datacenters. That goes for customers including neoclouds, public clouds, and even sovereign and enterprise clouds.

NPUs at Rack Scale

Qualcomm already has some AI expertise under its belt, in the form of the Qualcomm Cloud AI100 Ultra accelerator card. It targets inferencing, and its customers include Cirrascale and Core42. It’s also used in AWS’s DL2q deep learning inference instances.

The new chips are the follow-up generation, still targeting inference but now suitable for datacenter-sized deployments. The Qualcomm AI200 and AI250 are billed as card- and rack-scale accelerators.

The chips aren’t GPUs. Like the AI100, they are neural processing units (NPUs) based on Qualcomm’s Hexagon architecture. Hexagon was built specifically for inference, in contrast to the generalized parallel computing of GPUs. The new chips are designed for cloud and AI-factory scale, though, with features including 768 GB of memory per card.

The AI250 will add near-memory computing, which places processing closer to the data and promises better performance. The connection between processors and memory is a severe bottleneck in AI, as it can leave GPUs or NPUs sitting idle while waiting for data.

A Giant Leap

A bigger chip is a logical next step after the AI100, and there’s certainly plentiful capital being spent on AI factories. But in going to datacenter scale, Qualcomm pretty much has to start operating at rack scale. That’s going to be the way to sell these chips in immense volumes: They need to go into pre-integrated racks that get installed in modular or prefab datacenters.

NVIDIA already does this, of course. AMD will follow suit with rack-scale products around the upcoming Instinct MI400 chip. The inferencing chip startup d-Matrix announced rack-scale products at the recent Open Compute Project Summit. Going even further, Cerebras and Groq emerged as chip vendors at first but now build entire clouds for their accelerators.

Clearly, Qualcomm has partners providing things like liquid cooling and storage. That’s not the issue; the question is how well Qualcomm will do at selling equipment at the scale of integrated racks, which is more the expertise of a Dell or a Supermicro. It’s going to take a lot of partnership coordination followed by an uphill marketing effort against NVIDIA and AMD. (Another possibility is that Qualcomm gets an ODM partner to do the bulk of the work.)

Inference and the Edge

The industry is hungry for NVIDIA alternatives, however. AI factories are where the volume is, but other emerging angles could play in Qualcomm’s favor. Inferencing will be the bulk use case for AI, and one common theory is that inferencing will need to be distributed—placed near or right on top of the data. That’s particularly true in far-edge and IoT use cases. Think industrial, medical—or wireless.

So, one longer-term possibility is that AI200s and their successors sit in small clusters at edge sites. Those sites might require a handful of servers, or they might require a few racks, which Qualcomm would be ready to provide. This kind of edge environment is more familiar territory for the company.

In any event, by launching the AI200, Qualcomm has pressed itself into the rack-scale game, and it's going to seem like an unusual fit until the company establishes that it can not only build but also sell at that level.