Drift Signal

AI Depends on More Than Software

The compute-energy stack determines who captures value in the AI economy

Nicolas Colin
Aug 17, 2025
Credit: BoliviaInteligente (Unsplash)

Two weeks ago, I wrote about the shift from petrostates to electrostates, or how electricity is becoming the new foundation of geopolitical strength, replacing oil as the currency of power. That essay traced how China has systematically built an electricity-based industrial system whilst the West, most notably America, remains caught between old assumptions and new realities.

This follow-up examines the implications for artificial intelligence (AI) through a different lens: not as a software revolution, but as an infrastructure challenge.

Let me put it this way: OpenAI raised $6.6 billion for model development; Microsoft bought the output of a nuclear plant. One of these strategies will define the next decade. Whilst everyone obsesses over GPT-5 versus Claude Opus 4.1, the real game is being played in power plants and transmission lines.

The central insight is that AI's performance and economics are inseparable from the physical system supporting it—what I’ve decided to call the compute-energy stack. This stack has four layers: the energy grid (generation, transmission, storage), data centres (servers, chips, cooling), AI models (training, inference, optimisation), and integration (embedding AI into products and workflows).

  • The question is which layer will set the pace, capture the highest margins, and shape innovation. Capital flows illustrate the stakes: global spending on AI infrastructure already reaches tens of billions annually, split amongst technology firms, utilities, and infrastructure investors. Control over the stack implies control over returns.

As always, history provides crucial context to understand where this is going. During electrification from 1880 to 1930, industrial output rose significantly only after grids, generators, and industrial wiring were widely deployed. The invention of the dynamo alone delivered limited economic benefit. AI may follow the same pattern: infrastructure determines value capture, not algorithms or models.

My view is that few people understand this. Today's AI leaders behave like software evangelists when they should study infrastructure masters. As I'll discuss later in this essay, the real models for AI dominance are not Silicon Valley's startup wizards, but the infrastructure builders who created the monopolies and networks that defined earlier technological eras. These masters—from telecommunications to utilities to cable to e-commerce—understood what today's AI leaders miss: in capital-intensive networked businesses, infrastructure dominance beats technical superiority.

  • This, I believe, is key to understanding how AI will redistribute wealth and power. If you want to learn more, read on 👇

1/ Scale economics punish late movers

To understand why infrastructure beats algorithms, we need to examine each layer of the compute-energy stack. Each follows different economic rules, creating distinct bottlenecks and opportunities. Think of it like building a factory: you need land, buildings, machines, and workers. Each component has its own constraints and costs. Miss one, and the entire operation stalls.

The energy grid traditionally follows what economists call “lumpy” investment patterns. If you're an industrialist considering a new aluminium smelter, you need certainty that electricity demand will follow, because you can't build half a power plant. Each new facility costs more than the last: land near transmission lines becomes scarcer, environmental permits harder, coordination with existing infrastructure more complex. The experience curve helps, but it rarely offsets rising complexity costs.

Note that renewables are changing this calculus. Unlike the traditional, top-down grid, solar farms grow one panel at a time, wind farms one turbine at a time. Suddenly, energy infrastructure starts behaving more like software: modular, scalable, with declining marginal costs.

  • China knows this: it is building solar farms and turning the grid into a platform where millions of distributed assets respond to price signals in real time. Or, to put it more bluntly, China adds the equivalent of the entire US electricity grid every four years, 86% of it renewable. In other words, each passing year makes the Chinese grid more like software in terms of scalability, returns, and unit economics.

Compute infrastructure faces similar challenges. You can't buy half a data centre or three-quarters of a cooling system. Microsoft's recent deals illustrate the bind: they're paying double market rates for large units of nuclear power because the alternative is… no power at all. In addition, each new facility requires not just servers but transformers, backup generators, cooling towers, fibre connections. The complexity compounds with each site as coordination demands multiply.

AI models, on the other hand, should theoretically enjoy increasing returns: more data means better models, better models attract more users, more users generate more data. This is the virtuous cycle that built great companies such as Google and Facebook.

  • But there's a dark secret emerging: AI-generated content is poisoning the training data. It's like what happened to Twitter: early adopters enjoyed high-quality discussions with fascinating people; now it's polluted with bots, spam, and rage-bait. Models trained on synthetic data show the same degradation. Claude's writing assistance has deteriorated (trust me, I know); ChatGPT increasingly produces generic, hallucinated content. AI is already experiencing what Sangeet Paul Choudary once called “reverse network effects”: the more users your application has, the less value it creates for each user.

  • As early as 2020, Martin Casado and Matt Bornstein of Andreessen Horowitz explained that AI displayed much weaker increasing returns to scale than software. Their analysis showed AI companies combining elements of both software and services, with gross margins often in the 50-60% range rather than the 60-80% typical of SaaS businesses. Heavy cloud infrastructure costs, ongoing human support requirements, and persistent edge cases meant AI companies faced substantial variable costs that didn't disappear with scale. Unlike traditional software's “build once, sell many times” model, AI systems required continuous model retraining, customer-specific fine-tuning, and human oversight that made them resemble services businesses as much as software companies.

Distribution and integration appear frictionless but hide massive adoption costs. Convincing someone to try ChatGPT takes seconds—which explains the spectacular adoption speed that everyone is marvelling at. But integrating AI into a bank's loan approval process takes years. The constraint isn't technology but trust, regulation, and organisational inertia. Every enterprise software company knows this: the product is 10% of the battle; deployment is everything else.

All these layers interact in unexpected ways. Cheap renewable energy should make AI training costs plummet… but only if you can connect to the grid. Better models should dominate the market… but not if they're trained on degraded data. The stack is more than infrastructure; it is a complex system in which the economics of each layer shape the others. Miss this, and you'll build the AI equivalent of WeWork: beautiful spaces that haemorrhage cash because the unit economics never worked.

Related: “WeWork’s (Botched?) IPO” (Nicolas Colin, September 11, 2019)

2/ Electricity pricing determines competitive advantage

The economics of the compute-energy stack are stark and unambiguous: every token generated, every model trained, every inference executed translates kilowatt-hours into intelligence. At $0.10 per kWh, training a frontier model consumes roughly $5 million in electricity alone; at $0.02 per kWh, the same exercise costs $1 million. This 80% difference in marginal cost dictates the capacity to experiment, iterate, and ultimately capture market dominance. In other words, marginal electricity cost, rather than model architecture, increasingly shapes the competitive landscape.
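The arithmetic behind these figures is simple enough to sketch. A minimal illustration, assuming a frontier training run draws roughly 50 GWh in total (the figure implied by the $5 million-at-$0.10/kWh example above; real runs vary widely):

```python
# Electricity cost of a frontier training run at different power prices.
# The 50 GWh figure is an illustrative assumption, back-derived from the
# essay's $5M-at-$0.10/kWh example; actual consumption varies by model.

TRAINING_ENERGY_KWH = 50_000_000  # 50 GWh, assumed total draw for one run

def training_electricity_cost(price_per_kwh: float) -> float:
    """Electricity bill for one training run at a given $/kWh rate."""
    return TRAINING_ENERGY_KWH * price_per_kwh

grid_rate = training_electricity_cost(0.10)   # $5,000,000
cheap_rate = training_electricity_cost(0.02)  # $1,000,000

# The cheaper operator spends 80% less per run, i.e. five experiments
# for the price of one.
saving = 1 - cheap_rate / grid_rate           # 0.8
```

The point of the sketch is that the saving scales linearly with every run: the low-cost operator can iterate five times for each competitor's single attempt.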

  • Power pricing alone, however, is insufficient. Microsoft’s contract for Three Mile Island, priced at $100 per megawatt-hour for 837 megawatts, is roughly double the US grid average. What the steep premium secures is certainty: a single outage during training can waste weeks of computation and millions of dollars in sunk costs. Reliability, therefore, matters as much as price. This explains why data centres cluster in regions offering both stable and relatively inexpensive electricity rather than simply chasing the lowest nominal rate.
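The reliability premium can be framed as an expected-value trade-off. A rough sketch with entirely hypothetical numbers (the outage probability, rework cost, and power bills below are my assumptions, not figures from the Microsoft deal):

```python
# Why paying double for firm power can be rational: compare the expected
# cost of a run on cheap-but-flaky grid power against premium firm supply.
# All figures are illustrative assumptions, not from the essay.

def expected_run_cost(power_cost: float, outage_prob: float,
                      rework_cost: float) -> float:
    """Expected total cost of one training run.

    power_cost  -- electricity bill for the run
    outage_prob -- chance an outage forces redoing part of the run
    rework_cost -- sunk compute and time wasted by that redo
    """
    return power_cost + outage_prob * rework_cost

# Cheap grid power (~$50/MWh) with a 30% chance an outage wastes weeks
# of computation, here costed at $15M of sunk compute:
cheap_unreliable = expected_run_cost(2_500_000, 0.30, 15_000_000)  # ≈ $7.00M

# Firm nuclear power at double the price (~$100/MWh) with near-zero risk:
firm_reliable = expected_run_cost(5_000_000, 0.01, 15_000_000)     # ≈ $5.15M
```

Under these assumed numbers the "expensive" contract is the cheaper strategy in expectation, which is the logic behind paying a steep premium for certainty.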

As a result, geography is responding to electricity economics in a direct, measurable way. Iceland’s aluminium smelters offered the initial proof: locating production where power is cheapest, regardless of other disadvantages, yields economic advantage. AI amplifies this logic. Training occurs in Quebec during spring melt, when hydro prices turn negative, while inference shifts to Nevada’s solar-geothermal network. Traditional tech hubs such as Silicon Valley, Seattle, or Boston lose relevance if they cannot deliver power at competitive cost.

  • Grid constraints further complicate the calculus. Northern Virginia, despite hosting significant data centre capacity, faces three-to-seven-year grid interconnection delays, limiting the ability to bring new compute online. Texas, while rich in wind generation, experiences extreme intra-day price swings, from negative $25 to positive $9,000 per MWh, reflecting isolation from broader grid infrastructure. California curtails midday solar production even as it imports power in the evening.

These structural mismatches mean that theoretical electricity abundance often fails to meet actual compute demand, creating a critical friction point for scaling AI operations.

A software-defined, programmable grid offers a pathway around these constraints, with China at the forefront. Real-time pricing updates, electric vehicles as distributed storage, and (as is of interest in this edition) dynamic AI workload scheduling convert compute from a grid burden into a balancing asset. Training pauses when industrial demand spikes, resumes during peak renewable output, and thereby transforms electricity from a static cost into a variable one, arbitrageable across time and geography. Marginal cost is no longer simply a function of price per kilowatt-hour; it reflects the operator’s ability to exploit temporal and spatial price differentials.
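The pause-and-resume behaviour described above can be sketched as a simple hysteresis rule: stop drawing power when the spot price spikes, restart when it falls. The thresholds and the price series below are illustrative assumptions; a real scheduler would also checkpoint model state before pausing:

```python
# Price-responsive training sketch: pause above one spot-price threshold,
# resume below another. Two thresholds (hysteresis) avoid rapid flapping
# when prices hover near a single cut-off. Values are illustrative.

PAUSE_ABOVE = 120.0   # $/MWh: stop the training job above this price
RESUME_BELOW = 40.0   # $/MWh: restart once the price falls below this

def schedule(prices):
    """Map an hourly spot-price series to run/pause decisions."""
    running, plan = True, []
    for p in prices:
        if running and p > PAUSE_ABOVE:
            running = False
        elif not running and p < RESUME_BELOW:
            running = True
        plan.append(running)
    return plan

# A Texas-style day: negative prices at the wind peak, an evening spike.
hourly = [35, -25, 20, 150, 9000, 300, 30, 25]
print(schedule(hourly))
# → [True, True, True, False, False, False, True, True]
```

The job rides out the $9,000/MWh spike idle and resumes only once prices return to renewable-surplus levels, which is exactly the "compute as balancing asset" behaviour the paragraph describes.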

  • This gives rise to locational and temporal arbitrage as a core competency. Google, for example, trains models in Iowa’s wind corridor and executes inference on Oregon’s hydro network. Soon there will be firms dynamically reallocating workloads across regions to capture transient price advantages. Those capable of mastering this flexibility will achieve significant cost advantages over operators locked into single sites. Electricity pricing, rather than incremental algorithmic efficiency, increasingly defines competitive positioning.
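The locational half of that arbitrage reduces, in the simplest case, to routing each workload to whichever region is cheapest at that moment. The region names and prices below are hypothetical, and real routing would also have to weigh data-transfer costs, latency, and capacity limits:

```python
# Locational arbitrage sketch: send the next workload to the region with
# the lowest current spot price. Regions and prices are hypothetical.

def cheapest_region(spot_prices: dict) -> str:
    """Return the region with the lowest $/MWh spot price right now."""
    return min(spot_prices, key=spot_prices.get)

hour_1 = {"iowa_wind": 22.0, "oregon_hydro": 31.0, "texas": 95.0}
hour_2 = {"iowa_wind": 60.0, "oregon_hydro": 18.0, "texas": -25.0}

print(cheapest_region(hour_1))  # → iowa_wind
print(cheapest_region(hour_2))  # → texas
```

Note how the answer flips hour to hour; the competency the paragraph describes is precisely the ability to follow those flips faster than operators locked into a single site.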

In turn, the implications cascade throughout the compute-energy stack. Chip designers focus on joules per operation, data centre operators become de facto power traders, and model architects must weigh performance against electricity consumption. The industry’s gravitational centre shifts from purely software expertise toward infrastructure operations. Success is measured not by the theoretical power of a model, but by the ability to deploy it profitably at scale. In this environment, electricity management becomes a strategic lever on par with talent, capital, and data.

Related: “From Petrostates to Electrostates” (Nicolas Colin, Aug 3)

3/ Market concentration creates structural barriers

Control over the AI compute-energy stack is highly concentrated, and this concentration shapes who can compete:

This post is for paid subscribers
