Meta is going bigger with Nvidia, and it is not just GPUs. The two companies signed a multiyear deal for "millions" of chips plus networking to power Meta's U.S. AI data centers. Financial terms were not disclosed; analyst Ben Bajarin estimates the deal is "in the tens of billions." (investing.com)
What’s actually in the box:
Grace CPUs (Arm-based server chips) running "standalone" - that means CPU-only servers without paired Nvidia GPUs. Arm is a processor architecture designed for energy efficiency. In plain terms: Meta will move a lot of inference work (running trained AI models to produce outputs) and agent-style tasks to cheaper, lower-power CPUs. (NVIDIA News)
Vera Rubin rack systems - Nvidia's next-generation rack-scale AI platform that bundles Rubin GPUs and Vera CPUs. These racks will come online through partners in the second half of 2026, with large-scale Vera CPU use eyed for 2027. In short: racks that combine fast AI accelerators and efficient CPUs for big-model training and serving. (NVIDIA Investor)
Spectrum-X Ethernet - Nvidia's AI networking stack to connect thousands of accelerators inside clusters. This is the fabric that helps all those chips talk quickly and reliably. (NVIDIA Spectrum-X)
Why this matters (straight from Zuck):
Meta says it will use Nvidia's Vera Rubin platform "to deliver personal superintelligence to everyone in the world." That phrase summarizes the company's long-term goal: make powerful, personal AI assistants broadly available. (NVIDIA News)
Timeline and scale:
Multiyear rollout across Meta's AI campuses. Rubin hardware starts arriving in the back half of 2026; Vera CPU scale-up is targeted for 2027. (NVIDIA Investor)
Two flagship sites under construction: Prometheus (a 1-GW supercluster in New Albany, Ohio, slated to start coming online in 2026) and Hyperion (Richland Parish, Louisiana, planned to scale toward 5 GW through 2030). A note on units: GW is gigawatts, a measure of a campus's electrical power capacity, not a chip count. (CBS News)
Market and context:
Stock reaction: Nvidia and Meta ticked up in after-hours trading; AMD's shares fell about 4% on the headline. (investing.com - markets)
This is not a sudden move. Meta has used Nvidia for years, but it also runs AMD Instinct for some Llama inference (Llama is Meta's family of large language models) and develops its own custom AI silicon. There have also been reports that Meta may adopt Google TPUs (tensor processing units, Google's AI accelerators) starting in 2027. The takeaway: this is not an exclusive lock-in, but a major doubling down where Meta expects the most value. (AMD News)
Why founders should care:
Capacity squeeze - plan accordingly. Expect longer GPU lead times and volatile pricing as big buyers scale up. Short-term moves: push quantization (smaller numeric formats), batching, and sparsity to lower inference costs (see the sketches after this list).
Midterm options - check AMD Instinct and Google TPU capacity and pricing as part of your cost strategy.
Architecture shift - CPU-heavy inference is making a comeback. If your workload fits CPU-focused serving, Grace-class CPUs plus smarter serving software can cut real costs; the batching sketch below shows the serving-software side.
Strategic partnerships - look for partners that improve token efficiency per watt (better compilers, pruning, caching) so you spend fewer compute cycles for the same results. (NVIDIA News)
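To make the quantization point concrete, here is a minimal sketch (our illustration, not anything from Meta's or Nvidia's stack) that uses PyTorch dynamic quantization to store a toy model's linear-layer weights as int8. The model shape and sizes are hypothetical stand-ins:

```python
import io
import torch
import torch.nn as nn

# Hypothetical stand-in for a real trained model.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Linear(4096, 4096),
).eval()

# Dynamic quantization: weights are stored as int8 and activations are
# quantized on the fly at inference time. CPU-only, no retraining needed.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def serialized_mb(m: nn.Module) -> float:
    """Size of the model's weights when serialized, in megabytes."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"fp32 model: {serialized_mb(model):.0f} MB")      # ~134 MB
print(f"int8 model: {serialized_mb(quantized):.0f} MB")  # roughly 4x smaller

with torch.no_grad():
    out = quantized(torch.randn(1, 4096))  # int8 matmuls on CPU
```

Roughly 4x smaller weights means roughly 4x less memory traffic per token, and memory bandwidth is often what you are actually paying for in inference.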
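And the batching point as a sketch: 64 one-at-a-time CPU forward passes versus one batched pass. The model and numbers are again hypothetical; the principle is that batching amortizes per-request overhead and gives the CPU's vector units full matrices to work on:

```python
import time
import torch
import torch.nn as nn

torch.set_num_threads(8)  # hypothetical core budget for a CPU server

model = nn.Linear(2048, 2048).eval()                  # toy stand-in model
requests = [torch.randn(1, 2048) for _ in range(64)]  # 64 queued requests

with torch.no_grad():
    # One forward pass per request: per-call overhead dominates.
    t0 = time.perf_counter()
    for r in requests:
        model(r)
    sequential = time.perf_counter() - t0

    # One forward pass for the whole batch: a single (64 x 2048) by
    # (2048 x 2048) matmul the CPU can vectorize efficiently.
    batch = torch.cat(requests)
    t0 = time.perf_counter()
    model(batch)
    batched = time.perf_counter() - t0

print(f"64 sequential calls: {sequential * 1e3:.1f} ms")
print(f"1 batched call:      {batched * 1e3:.1f} ms")
```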
Bottom line:
Meta just validated Nvidia's end-to-end stack - CPUs, GPUs, and the network fabric - at hyperscaler scale. That move will ripple across the AI infrastructure market and affect everyone building AI products today. (NVIDIA News)