NVIDIA has dominated AI compute for the better part of three years. Its GPUs power the vast majority of model training and an increasing share of inference workloads. But in early 2026, the competitive landscape is shifting, and NVIDIA's own moves are the clearest signal of how seriously it takes the threat.
In December 2025, NVIDIA announced its largest acquisition ever: purchasing Groq's assets for approximately $20 billion in cash. Groq, the inference-focused chip startup behind the LPU (Language Processing Unit) architecture, had raised $750 million at a $6.9 billion valuation just three months earlier. Groq founder Jonathan Ross and senior leaders joined NVIDIA to help integrate the LPU inference technology into NVIDIA's Blackwell and upcoming Rubin architectures. One analyst noted the deal is "structured to keep the fiction of competition alive" — a pointed observation about industry consolidation.
The Hyperscaler Chip Programs
Every major cloud provider now designs its own AI silicon, and the latest generation is genuinely competitive for specific workloads.
Google released its 7th-generation TPU, codenamed Ironwood, in November 2025, a full decade after deploying its first custom AI ASIC in 2015. The competitive significance became clear when Anthropic closed the largest TPU deal in Google Cloud history, committing to hundreds of thousands of Trillium (6th-gen) TPUs in 2026 and scaling toward one million chips by 2027. For large-scale inference, Google's TPUs increasingly beat NVIDIA GPUs on price-performance.
Amazon's Trainium line continues to advance. Trainium3, built on TSMC's 3nm process, is in preview and headed for full deployment in early 2026, promising double the performance of Trainium2 with 40% better energy efficiency. Microsoft's Maia accelerator and Meta's MTIA chips are also in active deployment for internal workloads.
The Startup Landscape — Consolidation Begins
The $20 billion Groq acquisition marks what many see as the end of the "wild west" era for AI chip startups and the start of a consolidation phase dominated by a handful of mega-cap players.
- Cerebras: Raised $1.1 billion at an $8.1 billion valuation in September 2025. After withdrawing its IPO filing in October 2025, the company is now rumored to be targeting a Q2 2026 IPO with a valuation floor of $20 billion. Its wafer-scale engine continues to find customers in scientific computing and large-scale training.
- Etched: Its transformer-specific ASIC has generated buzz for hardwiring the transformer architecture directly into silicon rather than relying on programmable general-purpose compute.
- d-Matrix: Focused on in-memory computing for inference, targeting both edge and data-center deployments.
NVIDIA's Position
NVIDIA's Blackwell architecture (B100/B200) shipped in 2025 with roughly 2.5x the training speed of the prior Hopper generation and up to 25x better energy efficiency on large-model inference. The company has visibility toward $500 billion in combined Blackwell and Rubin AI chip revenue through calendar 2026, with cloud GPU capacity sold out. Blackwell sales have been described by management as "off the charts."
NVIDIA's deepest moat remains its full-stack ecosystem: CUDA, networking (InfiniBand/NVLink), software libraries, and the largest AI developer community in the world. For raw training performance on diverse frameworks, Blackwell remains difficult to beat. The switching costs for organizations with millions of lines of CUDA-optimized code are staggering.
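To make that switching cost concrete, here is a minimal sketch of the warp-shuffle idiom that pervades hand-tuned CUDA kernels. The kernel is illustrative, not drawn from any shipping codebase, but the intrinsic it leans on, __shfl_down_sync, is NVIDIA-specific and has no drop-in equivalent on other vendors' stacks.

```cuda
// Illustrative sketch only: a warp-level reduction of the kind that fills
// CUDA-optimized codebases. __shfl_down_sync is an NVIDIA-specific intrinsic.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void warp_sum(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    float v = (i < n) ? in[i] : 0.0f;
    // Tree reduction across a 32-thread warp via register-to-register
    // shuffles: no shared memory, no block-level synchronization.
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffff, v, offset);
    if ((threadIdx.x & 31) == 0)  // lane 0 of each warp holds the warp total
        atomicAdd(out, v);
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    *out = 0.0f;
    warp_sum<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("sum = %.0f (expected %d)\n", *out, n);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Porting even that one loop to a TPU, Trainium, or an LPU means adopting a different programming model, not recompiling. Multiply it across a large kernel library and the switching cost stops being abstract.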
Our Take
The AI chip market is fragmenting and consolidating simultaneously. Custom silicon from hyperscalers is winning inference workloads on price-performance. Startups are either being acquired (Groq) or racing to IPO (Cerebras) before the window closes. But NVIDIA's ecosystem lock-in remains formidable, and the Groq acquisition only strengthens its inference story.
For enterprises, chip selection is becoming a strategic decision. The days of defaulting to "just buy NVIDIA" are numbered for inference — but for training, NVIDIA's dominance shows no signs of cracking. The most sophisticated buyers are already running heterogeneous compute environments, matching workloads to the most cost-effective silicon. That trend will only accelerate.
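To give that heterogeneous-routing decision a concrete shape, here is a deliberately toy sketch. The accelerator labels echo products named above, but every cost figure is an invented placeholder rather than a benchmark; real routing would use measured per-workload costs from the buyer's own fleet.

```cpp
// Toy workload-to-silicon matcher. All cost numbers are invented
// placeholders for illustration, not measurements.
#include <cstdio>
#include <string>
#include <vector>

struct Accelerator {
    std::string name;
    double train_cost;  // relative cost per unit of training compute (placeholder)
    double infer_cost;  // relative cost per million tokens served (placeholder)
};

// Return the cheapest accelerator for the given workload class.
const Accelerator& cheapest(const std::vector<Accelerator>& fleet, bool training) {
    const Accelerator* best = &fleet.front();
    for (const auto& a : fleet) {
        double cost = training ? a.train_cost : a.infer_cost;
        double best_cost = training ? best->train_cost : best->infer_cost;
        if (cost < best_cost) best = &a;
    }
    return *best;
}

int main() {
    std::vector<Accelerator> fleet = {
        {"nvidia-blackwell", 1.00, 1.00},  // baseline (placeholder numbers)
        {"google-tpu",       1.20, 0.70},  // stronger on inference (placeholder)
        {"aws-trainium",     1.10, 0.85},  // improving on both (placeholder)
    };
    printf("training  -> %s\n", cheapest(fleet, true).name.c_str());
    printf("inference -> %s\n", cheapest(fleet, false).name.c_str());
    return 0;
}
```

With these placeholder numbers the matcher keeps training on NVIDIA and routes inference to TPUs, mirroring the split described above. In practice the decision also weighs porting effort, model compatibility, and capacity availability, but the cost matrix is the core of it.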