Nvidia’s Deal With Meta Signals a New Era in Computing Power
Summary
Nvidia is expanding beyond GPUs, selling CPUs to Meta in a multibillion-dollar deal. Alongside its flagship GPUs, the company is now targeting less compute-intensive workloads such as agentic AI and inference.
Meta buys into Nvidia CPUs
Meta is buying billions of dollars in Nvidia hardware to power its next generation of AI data centers. The deal marks the first large-scale deployment of Nvidia’s Grace CPU as a standalone chip by a major tech company.
The social media giant will use these chips to support a massive infrastructure roadmap focused on both training and inference. Meta expects to integrate millions of Nvidia’s Blackwell and Rubin GPUs into its systems over the next several years. This expansion solidifies a long-standing partnership between the two companies as Meta scales its internal AI capabilities.
Meta previously estimated it would own 350,000 H100 GPUs by the end of 2024. The company aims to have access to a total of 1.3 million GPUs by the end of 2025. While Meta has not confirmed if all those chips will come from Nvidia, this new deal suggests Nvidia remains its primary supplier.
The financial scale of this commitment is massive. Meta recently raised its planned infrastructure spending for this year to a range between $115 billion and $135 billion. This is a significant jump from the $72.2 billion the company spent on hardware and data centers last year.
Agentic AI changes the hardware equation
Industry analysts suggest the shift toward standalone CPUs stems from the rise of agentic AI. These software agents need general-purpose processing power to manage complex workflows and coordinate the work they hand off to GPUs. Creative Strategies CEO Ben Bajarin notes that traditional cloud applications and new AI agents both put heavy demands on CPU performance.
CPUs act as a critical link in the data center, preventing bottlenecks during high-speed computations. Microsoft recently reported that its data centers for OpenAI now require tens of thousands of CPUs to manage petabytes of data generated by GPUs. Without these processors, the GPUs would sit idle while waiting for data to move through the system.
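As a rough illustration of that bottleneck, the sketch below (plain Python, with made-up timings and names, not any vendor's actual pipeline) models a CPU loading stage feeding batches to an accelerator through a bounded queue; whenever the CPU stage falls behind, the accelerator's idle clock ticks up.

```python
import queue
import threading
import time

# Hypothetical timings: CPU preprocessing vs. accelerator compute per batch.
CPU_PREP_SECONDS = 0.02   # time the host CPU needs to prepare one batch
GPU_STEP_SECONDS = 0.01   # time the accelerator needs to consume one batch
NUM_BATCHES = 50

batches = queue.Queue(maxsize=8)  # bounded buffer between the two stages

def cpu_loader():
    """CPU stage: decode, tokenize, and stage data for the accelerator."""
    for i in range(NUM_BATCHES):
        time.sleep(CPU_PREP_SECONDS)   # stand-in for real preprocessing work
        batches.put(i)
    batches.put(None)                  # sentinel: no more data

def gpu_consumer():
    """Accelerator stage: sits idle whenever the queue is empty."""
    idle = 0.0
    while True:
        wait_start = time.perf_counter()
        item = batches.get()           # blocks if the CPU stage falls behind
        idle += time.perf_counter() - wait_start
        if item is None:
            break
        time.sleep(GPU_STEP_SECONDS)   # stand-in for the GPU kernel
    print(f"accelerator idle time waiting on CPU stage: {idle:.2f}s")

threading.Thread(target=cpu_loader, daemon=True).start()
gpu_consumer()
```

Speed up the CPU stage, or add more CPU workers, and the measured idle time shrinks. That is the logic behind pairing standalone CPUs with every GPU rack.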
Nvidia is positioning itself to provide a "soup-to-nuts" approach to this compute power. By selling the Grace CPU alongside its GPUs, Nvidia ensures that its proprietary interconnects link every part of the server rack. This strategy locks customers into a full Nvidia ecosystem for both simple processing and heavy AI training.
The technical requirements for inference—the process of running a trained model—differ from the requirements for training. Training requires raw, parallel power, while inference often benefits from low-latency, efficient processing. Nvidia CEO Jensen Huang estimated in 2022 that inference already accounted for 40 percent of the company's business.
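The difference is easy to see with a toy model. The hedged Python sketch below uses invented numbers for a fixed per-call overhead and a per-token cost; it is not a benchmark of any real chip, only a way to show why training favors big batches while inference favors small, fast calls.

```python
import time

FIXED_OVERHEAD = 0.010  # hypothetical per-call cost (kernel launch, weight reads)
PER_TOKEN = 0.0001      # hypothetical per-token cost

def fake_forward(num_tokens):
    """Stand-in for one forward pass: fixed overhead plus per-token work."""
    time.sleep(FIXED_OVERHEAD + PER_TOKEN * num_tokens)

REQUESTS = 32
TOKENS_PER_REQUEST = 100

# Training-style: batch everything, maximizing throughput per pass.
start = time.perf_counter()
fake_forward(REQUESTS * TOKENS_PER_REQUEST)
batched_total = time.perf_counter() - start

# Inference-style: handle each request on arrival, minimizing wait per user.
start = time.perf_counter()
for _ in range(REQUESTS):
    fake_forward(TOKENS_PER_REQUEST)
sequential_total = time.perf_counter() - start

per_request_ms = (FIXED_OVERHEAD + PER_TOKEN * TOKENS_PER_REQUEST) * 1000
print(f"one big batch: {batched_total:.2f}s total, "
      f"but every request waits for the whole batch")
print(f"one at a time: {sequential_total:.2f}s total, "
      f"~{per_request_ms:.0f}ms wait per request")
```

The big batch finishes more total work per second, but every user in it waits for the whole batch; serving requests one at a time sacrifices throughput to keep each response snappy, which is the niche that inference-focused chips target.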
Nvidia spends big on inference
Nvidia is aggressively protecting its lead in the inference market through massive acquisitions. The company recently spent $20 billion to license technology from Groq, a startup specializing in high-speed AI inference. This represents Nvidia's largest single investment to date.
The deal brings Groq CEO Jonathan Ross and his engineering team into Nvidia’s orbit. Groq’s technology centers on its LPU (Language Processing Unit) architecture, which offers lower latency than conventional GPUs for inference. Nvidia plans to use this intellectual property to make its own hardware more competitive for low-cost, high-performance inference tasks.
The Groq deal signals that Nvidia views specialized inference startups as a legitimate threat. While Nvidia dominates the training market, competitors are finding niches by offering faster response times for consumer-facing AI apps. By licensing Groq’s tech, Nvidia aims to maintain dominance across the entire AI lifecycle.
Nvidia’s hardware roadmap now includes several tiers of specialized chips:
- H100 and H200: The current industry standard for training large language models.
- Blackwell: The next-generation architecture designed for massive scale-up.
- Rubin: The future platform scheduled for release in 2026.
- Grace CPU: High-bandwidth ARM-based processors designed to work with GPUs.
- Vera: A new CPU architecture designed to pair with Rubin GPUs.
The fight for custom silicon
Meta’s massive purchase comes as other tech giants try to break their dependence on Nvidia. Google primarily uses its own Tensor Processing Units (TPUs) to train its Gemini models. Google has even reportedly discussed selling these custom chips to Meta to compete directly with Nvidia’s sales.
Amazon and Microsoft are also deploying their own custom AI silicon to reduce costs. Microsoft uses a mix of Nvidia GPUs and its internal Maia chips for Azure cloud services. Both companies want to avoid the high margins and supply constraints associated with Nvidia’s hardware.
Anthropic currently uses a diversified hardware stack to train its Claude models. The AI lab relies on a combination of:
- Nvidia GPUs for primary model training.
- Google TPUs through its partnership with Google Cloud.
- Amazon’s Trainium and Inferentia chips.
This diversification is a hedge against supply chain volatility. Anthropic CEO Dario Amodei has been a vocal critic of Nvidia’s market influence. He previously criticized Nvidia's lobbying efforts to ease export restrictions on advanced chips to China.
OpenAI diversifies its chip supply
OpenAI is pursuing one of the most aggressive diversification strategies in the industry. While the company signed a $100 billion infrastructure deal with Nvidia last year, it is actively seeking other partners. CEO Sam Altman has publicly courted several of Nvidia’s direct rivals.
In June, Altman appeared with AMD CEO Lisa Su to announce a major partnership. OpenAI agreed to purchase up to 6 gigawatts of compute power using AMD chips over several years. The deal includes an option for OpenAI to acquire a 10 percent stake in AMD, signaling a deep long-term commitment.
OpenAI is also working with Broadcom to design custom AI hardware and networking systems. This move mimics the strategy used by Google to build the TPU. By designing its own chips, OpenAI can optimize the hardware specifically for the GPT architecture.
Two weeks ago, OpenAI announced a $10 billion deal with Cerebras. This partnership adds 750 megawatts of ultra-low-latency compute to OpenAI’s platforms. Cerebras is known for its "wafer-scale" engines, which are massive single chips designed to handle AI workloads faster than traditional clusters.
Despite these moves, Nvidia’s revenue continues to grow as demand outstrips the total global supply of silicon. Hyperscalers like Meta and Microsoft are buying every chip they can find, regardless of the manufacturer. For now, Nvidia’s ability to provide a complete system of CPUs, GPUs, and networking keeps it at the center of the AI build-out.
