NVIDIA Ethernet Drives xAI's Supercomputer
In news that flew a bit under the radar, NVIDIA announced last month that xAI’s gigantic new supercomputer in Memphis, Tennessee—called Colossus—is using NVIDIA Spectrum-X Ethernet networking to connect 100,000 NVIDIA Hopper GPUs.
xAI, of course, is the company headed by Elon Musk (who is also CEO of Tesla and SpaceX). In its mission to “understand the universe,” xAI is training a series of large language models (LLMs), including Grok-0, Grok-1, and Grok-2.
The news of NVIDIA's role in Colossus is significant on two fronts. It underscores NVIDIA's dominance in the supercomputing market. It also illustrates the strength of NVIDIA's Spectrum-X brand of Ethernet networking, which retains the interoperability of standard Ethernet while supporting the demanding bandwidth and latency requirements of AI training.
Demands of the Largest Supercomputer
Colossus, which xAI and NVIDIA claim is the world's largest supercomputer, illustrates just how enormous AI's demands have become. To build it, xAI recruited some of the world's top engineers. The cluster comprises 100,000 H100 GPUs, configured in HGX servers containing eight GPUs each. (That's for now: The cluster will soon be expanded by another 100,000 GPUs.) The servers are housed in Supermicro liquid-cooled racks, with 64 GPUs per rack, for a total of roughly 1,500 GPU racks in the Colossus cluster.
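As a quick sanity check on the figures above (the server and rack counts are derived here from the article's numbers, not stated by xAI or NVIDIA):

```python
# Back-of-the-envelope math for the Colossus cluster, using the
# figures cited in the article.
total_gpus = 100_000
gpus_per_server = 8   # HGX servers with eight H100s each
gpus_per_rack = 64    # Supermicro liquid-cooled racks

servers = total_gpus // gpus_per_server  # 12,500 HGX servers
racks = total_gpus / gpus_per_rack       # ~1,562 racks

print(f"{servers:,} servers, ~{racks:,.0f} racks")
```

The rack figure works out to about 1,562, consistent with the roughly 1,500 racks cited; the planned expansion to 200,000 GPUs would double each of these counts.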