[Cover image: a dusk railway junction with two platforms labeled TPU and NVIDIA converging on a single station, amber lamps and tracks in the foreground]

Google started building its own AI chips in 2013 to stop paying Nvidia. On April 22, 2026, it shipped the eighth generation of those chips. The same day, Mira Murati's lab signed a multi-billion-dollar deal with Google Cloud — for access to Nvidia's newest chips instead.

Two announcements from the same company, pointing in opposite directions. Both are the same bet.

What the Chip Was For

In 2013, Jeff Dean and a handful of Google engineers did the math on what it would cost to serve speech recognition for every Android phone if the company kept renting GPUs. The answer was that Google would need to double the size of every data center it owned. The project that followed was named Tensor Processing Unit. The first generation went live in Google's fleet in 2015, running inference for Search and Translate. Google didn't acknowledge it publicly until Sundar Pichai mentioned it at I/O in 2016.

The design envelope was narrow. The TPU existed to do one thing cheaply: run Google's own neural networks on Google's own workloads, at Google's own scale, without paying Jensen Huang a margin on every inference. It was the opposite of a product. It was a cost center that paid for itself by not being a cost.

In 2018, Google made Cloud TPUs generally available. That was the first reversal — the internal escape hatch became an external product line. The logic still held: every TPU hour Google sold to a startup was an hour that didn't need an Nvidia GPU. The chip kept doing its original job even as the audience widened.

The Shift

Something changed between 2022 and 2025, and what changed wasn't the TPU. The TPU kept improving — v4, v5e, v5p, the Trillium generation, Ironwood. What changed was the shape of the market around it.

When OpenAI and Anthropic and a dozen frontier labs started buying compute the way airlines buy jet fuel — in multi-year, multi-gigawatt blocks — the question for any hyperscaler stopped being "can we build our own silicon?" and became "what currency do our customers want to be billed in?" Some customers wanted TPU hours. Most wanted CUDA. The pragmatic answer was to sell both.

In October 2025, Google and Anthropic announced a cloud deal worth tens of billions of dollars, giving Anthropic access to one million TPUs and a gigawatt of capacity. Anthropic was one of the first frontier labs to anchor a workload on Google's silicon at scale. The press release read like a competitive win over Nvidia. In practice, it was a routing decision: Anthropic's budget now flowed through Google Cloud, and Google — not the chip — was the counterparty.

Six months later, Mira Murati's Thinking Machines Lab signed a deal with the same cloud, for the same dollars, for the opposite chip. Google Cloud is a channel now. What flows through it is whatever the customer writes on the purchase order.

Three Roles

By April 22, 2026, Google was running three roles in the AI compute market simultaneously.

The first is architect. The TPU v8 split — 8t for training, 8i for inference — is not the move of a company hedging its silicon program. A single chip family splitting into workload-specific SKUs is what a committed silicon vendor does when it has enough internal and external demand to justify separate roadmaps for each. Intel split Xeon and Atom for the same reason. Google did it with TPU on April 22.

The second is reseller. The Thinking Machines deal put Google Cloud in the position every hyperscaler actually occupies underneath the marketing: one of the largest buyers of Nvidia capacity, reselling it to everyone who couldn't or wouldn't buy it directly. Google, AWS, and Azure together purchase a majority of Nvidia's data center output. The TPU narrative describes Google as Nvidia's competitor. The cloud P&L describes Google as Nvidia's distribution.

The third is customer. Google disclosed the same day that 75% of new code inside the company is now AI-generated — up from 50% the previous fall. The models writing that code run on Google's own infrastructure, mostly on TPUs. The silicon Google designed to escape Nvidia is now inside the loop that writes Google's software.

Each of the three roles subsidizes the other two. TPU revenue finances Nvidia procurement. Nvidia rentals finance the margin that funds TPU R&D. Internal TPU usage calibrates what the next TPU generation needs to do. Isolate any one and the economics get harder. Combine them and Google is the only company on earth that occupies all three positions in the market it helped create.

The Pattern

Railroads ran this arc a hundred and fifty years ago. They built their own track, their own rolling stock, their own signaling standards, their own terminals — and then sold trackage rights to competing lines, leased terminal space to rival shippers, and collected fees on every car that crossed their bridges regardless of whose locomotive pulled it. Infrastructure is capital-intensive and indivisible. You build it for one customer — yourself — and the marginal cost of serving the second customer is close to zero. If the second customer happens to be your competitor, you charge them more, not less. Once the track exists, every flow through it is revenue.

The dependency that TPUs were built to escape — paying Nvidia for every inference — still exists. It has simply been moved from Google's cost line to Google's revenue line. When Anthropic rents TPUs, Google wins. When TML rents Nvidia chips through Google Cloud, Google wins. When Google writes its own code on its own chips, Google wins. The three roles are not a portfolio hedge. They are the same tollbooth, collecting different tolls.

The Reseller

TPU v1 shipped in 2015 to save money Google would have spent on Nvidia. TPU v8 shipped in 2026 to defend a market position that now includes being Nvidia's cloud reseller. The original purpose — reduce dependency on a company Google didn't control — survived as a line item on the data center budget. The larger purpose — own the infrastructure every AI workload passes through, regardless of whose chip it runs on — is the one Google actually built.

Mira Murati's deal is the evidence. She left OpenAI to found Thinking Machines Lab, hired from OpenAI, raised at a premium valuation, and picked her infrastructure before she picked her product. She went to the same cloud Anthropic went to — and bought the opposite chip. Google took the order either way.

Eleven years ago, the TPU was a defection from Nvidia. Today, the TPU is one of two chip lines on the same Google Cloud shelf. The other one says Nvidia on the label, and Google is the retailer.