Broadcom's Tomahawk 6 Is Now Shipping: The World's First 102.4 Tbps Switch That Will Redefine AI Cluster Design


On March 12, 2026, Broadcom began shipping the Tomahawk 6 — the world's first switch chip to break the 100-terabit barrier — delivering 102.4 Tbps of switching capacity in a single device. It doubles the throughput of its predecessor, the Tomahawk 5, and does so while shrinking the number of switch tiers required to interconnect massive AI accelerator clusters. For the hyperscalers racing to wire together AI factories with hundreds of thousands of GPUs, this chip isn't a spec upgrade. It's a topology revolution.

Why 100 Terabits Is a Threshold, Not a Spec Sheet

The data center networking landscape for AI has changed faster than almost any other segment of the semiconductor industry. Three years ago, a 51.2 Tbps switch was a landmark. By 2025, that figure had become a bottleneck. As AI training clusters scaled from thousands to hundreds of thousands of accelerators, the network connecting them — not the chips themselves — emerged as the new chokepoint.

The problem is topological. Every switch tier added to a network introduces latency, increases the number of optical transceivers required, and multiplies the management surface. For AI training workloads in particular, where thousands of GPUs must synchronize gradient updates on every iteration, even microseconds of added latency compound into measurable slowdowns in job completion time. The math is brutal: a network-induced slowdown of just 3% in job completion time across 128,000-GPU training clusters translates into billions of dollars of wasted compute annually at hyperscale.
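As a rough illustration of that arithmetic, consider the back-of-envelope calculation below. The cost per GPU-hour, utilization, and slowdown fraction are assumptions chosen for the sake of the example, not figures from Broadcom or any hyperscaler.

```python
# Back-of-envelope: cost of a small network-induced slowdown at hyperscale.
# All inputs are illustrative assumptions, not vendor or hyperscaler figures.

gpu_count = 128_000            # accelerators in one training cluster
cost_per_gpu_hour = 2.50       # assumed fully loaded $/GPU-hour (capex + power + ops)
utilization = 0.90             # assumed fraction of the year spent on training jobs
slowdown = 0.03                # 3% longer job completion time from network latency

hours_per_year = 24 * 365
annual_spend = gpu_count * cost_per_gpu_hour * hours_per_year * utilization
wasted = annual_spend * slowdown

print(f"Annual compute spend, one cluster: ${annual_spend / 1e9:.2f}B")
print(f"Lost to a 3% slowdown:             ${wasted / 1e6:.0f}M per cluster per year")
# Multiplied across the many clusters a hyperscaler operates, the waste
# climbs toward the billions of dollars described above.
```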

The Tomahawk 6's 102.4 Tbps capacity addresses this directly. With twice the switching density of its predecessor, it reduces the number of switch tiers needed to interconnect large clusters — and that, more than the raw throughput number, is the story that matters to the engineers building next-generation AI infrastructure.

Inside the Chip: 3nm, Chiplets, and Cognitive Routing 2.0

Tomahawk 6 is built on TSMC's 3nm process using a modular chiplet architecture for both packet processing and SerDes — a departure from the monolithic approach Broadcom used through the Tomahawk 5. The shift to chiplets allows the company to improve port density without a linear increase in power draw, which is critical in AI racks already operating at their thermal limits.

The switch supports two SerDes configurations: either 512 lanes of 200G or 1,024 lanes of 100G on a single chip. The 200G option provides headroom for longer passive copper reach without active optical components, while the 100G configuration maximizes the number of endpoints a single chip can serve. Both options are compliant with Ultra Ethernet Consortium (UEC) specifications, which are rapidly becoming the open-standard framework for AI cluster networking.
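The two configurations trade lane speed against lane count while landing on the same aggregate capacity. A quick sanity check of the numbers is below; the lane counts and per-lane rates come from the announcement, while expressing the capacity as 800G front-panel ports is an illustrative assumption.

```python
# Sanity-check the two Tomahawk 6 SerDes options (both land at 102.4 Tbps).
# Lane counts and per-lane rates are from the announcement; the 800G port
# grouping is an illustrative assumption about front-panel configuration.

configs = {
    "512 x 200G SerDes": (512, 200),
    "1,024 x 100G SerDes": (1024, 100),
}

for name, (lanes, gbps_per_lane) in configs.items():
    total_gbps = lanes * gbps_per_lane
    print(f"{name}: {total_gbps / 1000:.1f} Tbps "
          f"({total_gbps // 800} ports at 800G)")
```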

On the software side, Broadcom introduced what it calls Cognitive Routing 2.0 — an adaptive load-balancing and congestion management system with fine-grained per-flow telemetry. In AI training environments, congestion events are highly bursty and correlated: all-to-all communication patterns during gradient synchronization saturate specific switch ports simultaneously. Cognitive Routing 2.0 is designed to detect and reroute around these hotspots in real time, maintaining network utilization at levels that minimize job completion time rather than just raw throughput.
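Broadcom has not published the internals of Cognitive Routing 2.0, but the general shape of telemetry-driven adaptive load balancing can be sketched as follows. This is an illustrative model only, not Broadcom's implementation; the queue-occupancy threshold and the least-loaded path-selection policy are assumptions.

```python
# Illustrative sketch of telemetry-driven adaptive flow placement: watch
# per-port queue occupancy and steer flows onto the least congested of the
# equal-cost uplinks. Not Broadcom's Cognitive Routing 2.0 implementation.

from dataclasses import dataclass, field

CONGESTION_THRESHOLD = 0.8  # assumed buffer-occupancy fraction that marks a hotspot

@dataclass
class Uplink:
    port: int
    queue_occupancy: float = 0.0   # 0.0 (empty) .. 1.0 (full), from per-port telemetry

@dataclass
class AdaptiveRouter:
    uplinks: list[Uplink] = field(default_factory=list)

    def pick_uplink(self) -> Uplink:
        """Place a flow on the least-loaded uplink (adaptive, not a static hash)."""
        return min(self.uplinks, key=lambda u: u.queue_occupancy)

    def hotspots(self) -> list[Uplink]:
        """Ports whose flows are candidates for rerouting."""
        return [u for u in self.uplinks if u.queue_occupancy > CONGESTION_THRESHOLD]

router = AdaptiveRouter([Uplink(0, 0.95), Uplink(1, 0.30), Uplink(2, 0.85), Uplink(3, 0.10)])
print("new flow ->", router.pick_uplink().port)             # port 3
print("hotspots ->", [u.port for u in router.hotspots()])   # ports 0 and 2
```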

Broadcom also offers a co-packaged optics (CPO) variant of the Tomahawk 6, branded as Davisson CPO. By integrating optical silicon directly onto the switch package rather than relying on pluggable transceivers, the CPO design reduces power consumption, shortens the electrical path to the fiber, and lowers total system cost at scale. Broadcom is showcasing Davisson alongside more than 30 ecosystem partners at the Optical Fiber Communication Conference (OFC) 2026 in Los Angeles this week.

Fewer Tiers, Bigger Clusters: The Topology Payoff

The most direct operational consequence of Tomahawk 6's doubling of switch bandwidth is what it does to fat-tree network topologies — the dominant design for large-scale AI cluster fabrics. In a fat-tree built with Tomahawk 5, connecting 128,000 accelerators required three switch tiers: a leaf layer, a spine layer, and a superspine layer. Every additional tier means more optical components, more power draw, more potential failure points, and more latency.

With Tomahawk 6, that same 128,000-XPU scale-out network collapses to just two switch tiers. According to Broadcom's announcement, the chip enables a 128K-XPU fabric with only a leaf and spine layer — eliminating an entire tier of hardware, reducing optics count, simplifying load balancing, and delivering lower all-reduce latency in a single architectural decision.
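The tier reduction falls directly out of switch radix. In a non-blocking two-tier leaf-spine (folded Clos) fabric, a switch with R ports supports roughly R²/2 endpoints, which is where the 128K figure comes from. The sketch below assumes 200G per endpoint; the radix values follow from the SerDes configurations described earlier.

```python
# Endpoint capacity of a non-blocking two-tier leaf-spine (folded Clos) fabric.
# Each leaf splits its ports half down (to XPUs) and half up (to spines), and
# each spine port attaches one leaf, so the fabric tops out near radix^2 / 2.

def two_tier_endpoints(radix: int) -> int:
    leaves = radix                    # spine radix caps the number of leaves
    endpoints_per_leaf = radix // 2   # the other half of each leaf goes upward
    return leaves * endpoints_per_leaf

# Port counts below assume 200G per endpoint:
#   Tomahawk 5: 51.2 Tbps  -> 256 x 200G ports
#   Tomahawk 6: 102.4 Tbps -> 512 x 200G ports
print("Tomahawk 5, two tiers:", two_tier_endpoints(256))   # 32,768 -> needs a third tier for 128K
print("Tomahawk 6, two tiers:", two_tier_endpoints(512))   # 131,072 -> covers 128K XPUs
```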

For scale-up workloads — where a tightly coupled group of accelerators needs all-to-all connectivity within a single job — a single Tomahawk 6 chip can connect 512 XPUs with single-hop all-to-all communication, roughly double the single-hop domain its predecessor could support. That directly benefits the most latency-sensitive AI inference workloads being deployed in production today.

Broadcom describes the chip as designed to meet demand for AI clusters with more than one million XPUs. That's not a configuration that exists today, but it's a target that hyperscalers have been publicly discussing for 2027 and 2028. The Tomahawk 6 is being designed in now so it can be deployed at that scale when the time comes.

Ethernet's Moment Against InfiniBand

The Tomahawk 6's launch comes at a decisive inflection point in the Ethernet-versus-InfiniBand debate that has defined AI cluster networking for the past three years. NVIDIA's InfiniBand platform — particularly the Quantum-X800 (Quantum-3 switch) — also offers 102.4 Tbps of raw switching capacity. The two technologies now stand at near-parity on the headline metric that dominated previous competitive comparisons.

The battleground has shifted. Ethernet's structural advantage has always been openness: multiple vendors, interchangeable components, no dependency on a single supplier's networking stack. For hyperscalers that have spent years building proprietary network management tooling on top of standard IP protocols, Ethernet offers a degree of control and cost predictability that InfiniBand cannot match. NVIDIA's acquisition of Groq — which TTN covered yesterday in the context of GTC 2026 — has further concentrated concern among hyperscalers about the long-term cost of a vertically integrated NVIDIA compute-and-networking stack. Every dollar spent on InfiniBand is a dollar that flows to the same company supplying the GPUs.

The Ultra Ethernet Consortium, which counts AWS, Meta, Google, Microsoft, AMD, Intel, and Broadcom among its members, is making this structural preference explicit. UEC specifications are designed to bring RDMA-class latency performance to standard Ethernet fabrics, and the Tomahawk 6's full compliance with those specifications means it can serve as the backbone of a hyperscaler's AI cluster without requiring any proprietary network components.

An Open Ecosystem: OCI and 30+ Partners

Broadcom is moving aggressively to build the ecosystem layer around Tomahawk 6, recognizing that a switch chip alone doesn't win large enterprise or hyperscale deployments. The company helped launch the Optical Compute Interconnect (OCI) agreement — an industry initiative to establish a common specification allowing networking hardware and optical components from multiple vendors to interoperate within AI cluster fabrics.

At OFC 2026, Broadcom is demonstrating interoperability with more than 30 partners spanning switch silicon, optics, NICs, and cable vendors. Alongside the optical ecosystem play, Broadcom is partnering with JetCool — a unit of Flex Ltd — to develop direct liquid cooling systems for Tomahawk 6 deployments. As switch chips pack more bandwidth into tighter packages, thermal management is becoming a first-order design constraint. JetCool's approach removes heat directly from the chip surface rather than relying on airflow, enabling the kind of dense switching fabrics that the next generation of AI infrastructure requires.

The Roadmap and Revenue Stakes

Broadcom's public roadmap for the Tomahawk series extends well beyond the current generation. Industry analyses point to a Tomahawk 7 targeting 204.8 Tbps and a Tomahawk 8 projected at 409.6 Tbps — both already in view for the largest AI infrastructure builders. The cadence of doubling capacity per generation has held consistent across the Tomahawk line, and Broadcom's rapid production ramp of the Tomahawk 6 — from initial samples to full production volume in less than three quarters — suggests the engineering organization is operating at a clip that matches the urgency of its largest customers.

The financial stakes are substantial. Broadcom reported record Q1 FY2026 results on March 4, driven by AI infrastructure demand, and announced a $10 billion share buyback authorization alongside the results. Analysts tracking the company project AI-related revenue could reach $10 billion to $11 billion in the April quarter alone, potentially exceeding $65 billion for the full fiscal year 2026 and surpassing $120 billion by fiscal 2027 as the Tomahawk 6 ramps across hyperscaler deployments and new XPU programs scale in parallel.

Those projections reflect a market reality that is difficult to overstate: the build-out of AI infrastructure has moved from proof-of-concept to industrial-scale capital deployment. The hyperscalers spending hundreds of billions of dollars on AI factories need every component of those factories — including the network fabric — to keep pace with accelerator roadmaps. Broadcom, with Tomahawk 6 now in production, is staking a claim to own the most critical interconnect layer in that stack.

What This Means for AI Infrastructure Builders

For engineering teams designing AI cluster networks today, the Tomahawk 6's availability changes the calculus on both near-term build-outs and long-range planning. The ability to collapse 128K-XPU fabrics from three tiers to two is an immediate, quantifiable cost reduction — fewer switches, fewer transceivers, lower power draw, and a simpler management surface. For clusters being designed now for deployment in 2026 and 2027, the Tomahawk 6 is almost certainly the reference switching platform.
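To put rough numbers on that reduction, the sketch below compares a simplified bill of materials for a 128K-endpoint fabric built as two tiers of Tomahawk 6 versus three tiers of Tomahawk 5. The accounting is a simplified non-blocking Clos model, assumes 200G endpoints, and counts one transceiver per end of each switch-to-switch link; real designs add oversubscription, failure domains, and copper endpoint links, so treat the figures as illustrative.

```python
# Rough fabric bill-of-materials for a 128K-XPU non-blocking Clos fabric:
# two tiers on Tomahawk 6 (radix 512 at 200G) versus three tiers on
# Tomahawk 5 (radix 256 at 200G). Simplified accounting for illustration.

def clos_bom(endpoints: int, radix: int, tiers: int) -> dict:
    half = radix // 2
    leaves = endpoints // half                 # each leaf: half ports down, half up
    if tiers == 2:
        spines = (leaves * half) // radix      # spine ports absorb all leaf uplinks
        switches = leaves + spines
        fabric_links = endpoints               # one layer of inter-switch links
    else:  # tiers == 3
        aggs = leaves                          # aggregation layer mirrors the leaf layer
        cores = (aggs * half) // radix         # core absorbs all aggregation uplinks
        switches = leaves + aggs + cores
        fabric_links = 2 * endpoints           # two layers of inter-switch links
    return {"switches": switches,
            "fabric_links": fabric_links,
            "optics": 2 * fabric_links}        # one transceiver per link end

print("Tomahawk 6, 2 tiers:", clos_bom(131_072, 512, 2))   # ~768 switches
print("Tomahawk 5, 3 tiers:", clos_bom(131_072, 256, 3))   # ~2,560 switches, 2x the optics
```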

The more interesting question is what happens at the million-XPU scale Broadcom is explicitly targeting. At that scale, even a two-tier Ethernet fabric built on 102.4 Tbps switches will require careful engineering to maintain the all-reduce performance that makes distributed training viable. The combination of Cognitive Routing 2.0's adaptive congestion management, the ultra-high-density SerDes options, and the forthcoming co-packaged optics deployment at OFC 2026 suggests Broadcom has thought carefully about that challenge.

The speed of execution — sampling to production in under three quarters for a chip of this complexity — also signals something about competitive positioning. In the current AI infrastructure arms race, being first to production at each bandwidth threshold matters as much as the bandwidth itself. Broadcom is first to 102.4 Tbps in Ethernet. The next threshold is 204.8 Tbps. The race is already on.
