Google Cloud Next: AI Takes Center Stage


By: Mary Jander

During its Google Cloud Next ’24 conference in Las Vegas this week, Google Cloud unveiled a series of AI infrastructure products and services that demonstrate its intent to diversify its sources of AI components while facilitating AI inference for enterprise customers.

With the announcements, the hyperscaler means to prove it can deliver the AI goods as well as or better than its rivals Microsoft and AWS, which also sport platforms for AI development. To make its case, it's parading a raft of impressive customer testimonials and references.

Leading those announcements is the general availability of the company’s TPU v5p, designed for training AI models on AI Hypercomputer, the company’s AI platform. According to Google Cloud, this new Tensor Processing Unit runs twice as many chips in a single pod or cluster (8,960) and delivers nearly 12X the throughput of its predecessor. The TPU v5p is also newly compatible with Google Kubernetes Engine (GKE) and can serve multiple hosts on GKE. Incidentally, during his keynote remarks, Google Cloud CEO Thomas Kurian noted that usage of TPUs and GPUs on GKE grew more than 900% over the past year.
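In practice, running a workload on TPU slices in GKE comes down to scheduling pods onto TPU node pools via node selectors and a TPU resource request. The sketch below uses GKE's published label keys, but the specific accelerator type, topology, image, and chip count are illustrative assumptions, not details from the announcement:

```yaml
# Hypothetical pod spec targeting a TPU v5p slice node pool on GKE.
# The nodeSelector keys are GKE's documented TPU labels; the values
# shown here (accelerator type, topology) are example choices.
apiVersion: v1
kind: Pod
metadata:
  name: tpu-training-job
spec:
  nodeSelector:
    cloud.google.com/gke-tpu-accelerator: tpu-v5p-slice
    cloud.google.com/gke-tpu-topology: 2x2x1  # smallest v5p slice shape
  containers:
  - name: trainer
    image: us-docker.pkg.dev/my-project/my-repo/trainer:latest  # placeholder image
    resources:
      limits:
        google.com/tpu: 4  # TPU chips per node, matching the 2x2x1 topology
```

Multi-host serving, which the announcement highlights, extends this pattern across several such pods scheduled onto the nodes of a larger slice.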

Google Cloud Supports NVIDIA Networking

But Google Cloud’s emphasis isn’t only on its own silicon. The vendor has also expanded availability of virtual machine instances, called Google Cloud A3 Mega VMs, based on NVIDIA GPUs. And plans are underway to add the new NVIDIA Blackwell chips early next year, along with NVIDIA’s advanced networking technologies for those components.

“[NVIDIA Blackwell] B200 and GB200 chips are both powered by next-generation NVIDIA networking,” said Amin Vahdat, Google Cloud VP and GM, Machine Learning, Systems, and Cloud AI, in his keynote remarks. Further, the A3 Mega VMs double the GPU-to-GPU bandwidth of the predecessor A3 VMs, further indicating that Google Cloud is taking advantage of NVIDIA’s NVLink and InfiniBand networking.

Notably, Google Cloud made no announcements regarding Jupiter, the networking for its AI Hypercomputer, an architecture based on optical circuit switching (OCS), wavelength-division multiplexing (WDM), and software-defined networking (SDN). Presumably, Jupiter coexists in Google Cloud with NVIDIA’s NVLink and InfiniBand, just as Google’s TPUs are offered alongside NVIDIA’s GPUs. (We say “presumably” because a request for clarification wasn’t answered as of this writing.)

An angle of a Google Cloud AI Hypercomputer. Source: Google Cloud

Going for Diversity

The message is clear: Google Cloud intends to support NVIDIA’s AI chips as well as its own, countering the idea that it’s competing with its supplier of AI components and acknowledging NVIDIA’s market strength and dominance. “The rate of progress with NVIDIA GPUs is truly astonishing,” said Vahdat in his keynote.

Google Cloud claims to be rapidly meeting demand for AI infrastructure among major customers. “More than 60% of funded GenAI startups and nearly 90% of GenAI unicorns are Google Cloud customers, including companies like Anthropic, AI21 Labs, Contextual AI, Essential AI, and Mistral AI, who are using our infrastructure,” said CEO Kurian in his keynote.

Storage and More AI Infrastructure

Storage plays an important role in AI training and inference, and Google Cloud has made improvements in this space as well. The company is offering “AI-optimized storage” in the form of new caching features in Cloud Storage FUSE and Parallelstore, two of the vendor's key storage environments for AI Hypercomputer. It has also introduced Hyperdisk ML in preview, a block storage service aimed at accelerating the loading of models into the system for training.

Google Cloud also has upgraded its Dynamic Workload Scheduler, a job scheduling service for AI workloads. Users now have the option of a “flex start” mode, which queues a job to run as soon as the requested TPUs or GPUs become available. A calendar mode allows users to reserve access to AI resources ahead of time, up to eight weeks in advance of running the job. Both modes are in preview.
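On GKE, Dynamic Workload Scheduler's flex-start behavior builds on the Kubernetes ProvisioningRequest API for queued provisioning. As a hedged sketch (the class name follows GKE's queued-provisioning convention, but the resource names, counts, and duration here are illustrative assumptions):

```yaml
# Hypothetical ProvisioningRequest asking the cluster autoscaler to
# obtain GPU capacity atomically, then run the job once it's available.
apiVersion: autoscaling.x-k8s.io/v1beta1
kind: ProvisioningRequest
metadata:
  name: gpu-job-request
spec:
  provisioningClassName: queued-provisioning.gke.io  # DWS-backed queued mode on GKE
  parameters:
    maxRunDurationSeconds: "86400"  # capacity held for up to a day once provisioned
  podSets:
  - count: 2  # number of identical pods to place
    podTemplateRef:
      name: gpu-job-pod-template  # a PodTemplate requesting GPU resources
```

The all-or-nothing semantics matter for distributed training: the job waits in the queue until every requested accelerator can be provisioned at once, rather than starting on a partial allocation.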

Extending AI to the Edge

Google Cloud is also focused on the network edge, touting its Google Distributed Cloud (GDC) as a solution for bringing AI to locations formerly problematic due to data sovereignty rules, regulatory compliance, or the need for low latency at the edge. Options include an edge hardware appliance; a hosted service for air-gapped environments; and a virtual, on-premises implementation of Google Cloud.

The details of GDC are complex, but Google Cloud emphasizes the ability for enterprises to run models in it against their own data. “GDC’s AI solutions and services provide you with the ability to build on a truly open and comprehensive AI stack on-premises, integrated with the latest models developed by Google, third parties, and the open source community,” stated Vithal Shirodkar of Google Distributed Cloud and Geo Expansion at Google Cloud, in a blog post for Next.

AI Agents Galore

The emphasis on AI infrastructure at Google Cloud Next ’24 is meant to bolster the broader picture of how the company is helping its customers create AI applications and agents on Google Cloud, and to show it isn't falling behind its rivals AWS and Microsoft.

To prove the point, the company proudly paraded a busload of impressive customer testimonials throughout Tuesday’s keynote speeches, including cameos from the heads of Goldman Sachs, Mercedes-Benz, and other organizations. Names were dropped liberally: McDonald's, Deutsche Bank, HCA Healthcare, the Mayo Clinic, US Steel, and many more. The company presented intriguing demonstrations of what companies such as these are doing, or could be doing, with Google Cloud AI products and services, in areas such as retail, creative marketing, healthcare benefits planning, and developer coding.

Intense Competition

There is much, much more going on this week at Google Cloud Next, including announcements around general-purpose computing. In that category, the vendor revealed plans for Axion, its first Arm-based CPU.

But the centerpiece is clearly AI. “We’re at a pivotal moment in our industry,” said CEO Kurian in his keynote remarks. “Together we’re creating a new era of generative AI agents built on a new, truly open platform for AI, and we’re reinventing infrastructure to support it.”

Futuriom Take: Google Cloud refuses to take a backseat to its rivals in the AI race. It's broadened its options to diversify the components of its AI Hypercomputer, supporting its own and NVIDIA's chips and networking. And it's aggressively set its sights on leveraging that infrastructure to enable GenAI applications at the enterprise level. If this week's demonstrations are any proof, it seems to be succeeding.