Why Google might win (2/5)
On infrastructure
In Part 1, I wrote about how Google makes its own AI chips cheaper than anyone can buy Nvidia’s. That’s a real advantage, but chips are only part of the story. A chip sitting in a poorly-cooled data centre, drawing expensive power, transmitting data across rented pipes, is not cheap. The TPU advantage only compounds if the infrastructure around it is efficient.
I’m calling this the infrastructure layer. It has five components: grid access, power reliability, power efficiency, intra-data centre networking, and inter-data centre networking. On most of these, Google is strong–but not uniquely so. Other hyperscalers secured grid connections decades ago. Everyone is diversifying into nuclear. Meta is building subsea cables at a pace that rivals Google’s.
One metric, though, tells a different story.
When a data centre buys $100 of electricity, not all of it reaches the processors. A large chunk gets burned on cooling, power conversion, and overhead. The ratio of total power consumed to power that actually does useful computation is called PUE (power usage effectiveness). Lower is better. A perfect score is 1.0.
The industry average is 1.56. For every $100 of electricity, $36 is wasted.
Google’s number is 1.09. Microsoft is 1.16. Amazon is 1.15.
And then there’s Meta, at 1.08–slightly better than Google.
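The dollar framing follows directly from the definition. A quick sketch of how much of every $100 electricity bill is lost to overhead at each PUE figure above (the function name is mine; the PUE numbers are from the text):

```python
# PUE = total facility power / power delivered to IT equipment.
# Of every $100 of electricity, 100/PUE dollars do useful computation;
# the rest goes to cooling, power conversion, and overhead.
def wasted_per_100(pue: float) -> float:
    return 100 * (1 - 1 / pue)

for name, pue in [("Industry average", 1.56), ("Google", 1.09),
                  ("Meta", 1.08), ("Microsoft", 1.16), ("Amazon", 1.15)]:
    print(f"{name} (PUE {pue}): ~${wasted_per_100(pue):.0f} wasted per $100")
```

The industry average loses about $36 of every $100; Google and Meta lose $8 and $7. An efficiency gap of this size recurs on every electricity bill, forever.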
That last number is important because it tells you why this efficiency exists. It isn’t a Google-specific secret. Meta got there the same way Google did. Both companies built their data centres to run their own products–Google for Search and YouTube, Meta for Facebook and Instagram. When you own both the hardware and the workload, you can design them as one integrated system. Cooling, power delivery, and server layout are co-optimised for a specific job.
Microsoft and Amazon built for a different purpose: selling cloud infrastructure to thousands of customers with wildly different workloads. That flexibility comes at a thermodynamic cost. Generalised infrastructure is inherently less efficient than purpose-built infrastructure.
So if Meta matches Google on efficiency, why does Google still win this layer?
Because efficiency without distribution is just a cost saving. Meta runs some of the most efficient data centres in the world, and cannot sell a single dollar of that advantage to an external customer. It has no cloud business. The efficiency stays locked inside Facebook and Instagram, unable to compound into a higher-margin revenue stream.
Google runs the same efficient infrastructure and sells it. Every marginal efficiency gain flows through to Google Cloud’s pricing and then to Google’s margins. Stack this on the chip argument from Part 1: Google pays less per chip, then pays less to power each chip. At roughly 50% gross margins for AI–considerably thinner than Search’s 89%–being structurally cheaper at both layers is the difference between a sustainable business and a money-losing one. It is also why Gemini’s API pricing is cheaper than OpenAI’s for comparable models.
No single metric gives Google a decisive moat. But completeness does.
Google designs its own chips. It runs tier-one efficient data centres. It operates a private global network. It builds a frontier model. And it sells all of this as a cloud service.
Now look at everyone else.
Meta matches Google on efficiency and networking, but it doesn’t sell cloud, and its models are open-sourced rather than commercially hosted.
Microsoft sells cloud at enormous scale but doesn’t make its own chips, runs less efficient data centres, and licenses its model from OpenAI without controlling the research roadmap.
Amazon sells cloud and is investing in chips (Trainium) and cables (Fastnet), but both are early, and it has no frontier model.
OpenAI and Anthropic build the models everyone talks about. But they own nothing underneath. No data centres, no chips, no cables. Every token they serve includes a landlord’s tax to whichever hyperscaler hosts them. OpenAI pays Microsoft. Anthropic pays Google and Amazon.
Dario Amodei made this fragility vivid in a recent conversation with Dwarkesh Patel. Dwarkesh pushed him hard: if you really believe we’re a few years from having “a country of geniuses in a data centre,” why aren’t you buying far more compute?
Dario’s answer was striking. He explained that Anthropic’s revenue has been growing roughly 10x per year–from zero to $100 million in 2023, to $1 billion in 2024, to $9–10 billion in 2025. So he sat down and did the math. If that 10x rate continues, revenue would be $100 billion by end of 2026 and $1 trillion by end of 2027. He could, in theory, buy $1 trillion of compute starting in 2027 to keep pace with that demand.
But data centre commitments are made years in advance. If the revenue curve is even slightly off–if growth is 5x a year instead of 10x, or if the “country of geniuses” arrives in mid-2028 instead of mid-2027–Anthropic is stuck paying for compute it can’t monetise. As Dario put it: “if you’re off by only a year, you destroy yourselves.”
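The sensitivity Dario describes is easy to reproduce. Branching from 2025 revenue under the two growth assumptions (a back-of-the-envelope sketch using the figures from the conversation, not a forecast):

```python
# Branch from ~$10B of 2025 revenue (per the text) under two growth
# assumptions, and watch how fast the paths diverge.
revenue_2025 = 10.0  # $B

for rate, label in [(10, "10x"), (5, "5x")]:
    r = revenue_2025
    path = {}
    for year in (2026, 2027):
        r *= rate
        path[year] = r
    print(f"{label} path: 2026 ~${path[2026]:g}B, 2027 ~${path[2027]:g}B")
```

By the end of 2027 the two paths differ by a factor of four: ~$1 trillion versus ~$250 billion. Compute committed against the 10x curve sits idle on the 5x one, which is exactly the “destroy yourselves” scenario.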
This is the existential arithmetic of being a tenant. The numbers confirm it. Anthropic projected roughly 40% gross margins in 2025–sixty cents of every revenue dollar going straight to compute costs. It burned $5.6 billion in cash in 2024, expects to burn another $3 billion in 2025, and plans to spend roughly $80 billion on cloud infrastructure through 2029. Break-even isn’t projected until 2028. The revenue is real. The product is excellent. But the company is running on investor confidence that the revenue curve outpaces the cost curve before the money runs out.
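The margin figure and the “sixty cents” claim are two views of the same arithmetic. A sketch using the numbers in the text (the 2025 revenue midpoint is my assumption; these are not Anthropic’s actual accounts):

```python
# ~40% gross margin means ~60 cents of every revenue dollar goes to compute.
revenue_2025 = 9.5   # $B, midpoint of the $9-10B range cited above
gross_margin = 0.40

compute_spend = revenue_2025 * (1 - gross_margin)  # cost of revenue
gross_profit = revenue_2025 * gross_margin
print(f"Compute: ~${compute_spend:.1f}B; gross profit: ~${gross_profit:.1f}B")
```

Roughly $5.7 billion of a $9.5 billion year goes straight back to the landlords, before a dollar of research, salaries, or training runs is paid for.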
Google’s position is structurally different–not risk-free, but categorically different. It plans to spend $175–185 billion on infrastructure in 2026, nearly double the $91 billion it spent in 2025. If AI demand doesn’t materialise as expected, its margins compress and shareholders are unhappy. What doesn’t happen is bankruptcy.
Alphabet earned over $130 billion in net income in 2025 on $400 billion in revenue. Search, YouTube, and Cloud keep generating cash regardless of whether frontier AI justifies its training costs on schedule. Google can be wrong by a year, or two, and come out bruised but alive. Anthropic cannot.
If scaling laws hold–if more compute continues to produce better models–then the company that trains cheapest trains most. Owning every layer means every efficiency gain compounds with no intermediary taking a cut. There’s no landlord. There’s no margin leakage. That’s a structural advantage that can’t be closed by a funding round.
And right now, only Google’s stack has no gaps in it.

