When the Curve Bends: AI’s Infrastructure Reckoning
Diminishing returns in model scaling threaten to strand GPUs, megawatts, and capital
Over the last 18 months, the story of AI has been told as a straight line: bigger models, more compute, more power. But that curve may already be bending. And if it is, the first place you’ll see the effects isn’t on a leaderboard or a benchmark. It’s in the electrical grid.
Large-scale AI systems, particularly transformer-based large language models (LLMs), are extraordinarily compute- and energy-intensive. But recent evidence suggests that the marginal gains from scaling these models are tapering off. Despite 10x increases in compute for newer frontier models, their real-world utility—reasoning, reliability, tool integration—has improved only incrementally. Put more concretely, GPT-4 is not 10x better than GPT-3.
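To see why, consider the shape of published scaling laws. Early scaling-law studies reported test loss falling as a small power of training compute; taking an exponent of roughly 0.05 as an illustrative value:

L(C) ∝ C^(−0.05)  ⇒  L(10·C) / L(C) = 10^(−0.05) ≈ 0.89

Under that curve, a 10x jump in compute buys only about an 11% reduction in loss, and the mapping from loss to perceived real-world capability is looser still.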
Formally, we can think of total power demand from AI as:
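P_AI ≈ P_train + P_inference ≈ (C_train / T + R_tokens · c_token) · e · PUE

where C_train is frontier training compute amortized over a period T, R_tokens is the rate of tokens served, c_token is compute per token, e is energy per unit of compute, and PUE is the data-center overhead factor. The exact decomposition is a rough sketch and the variable names are illustrative, but both terms scale with how much training and serving is actually happening.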
In other words, power consumption is the thermodynamic shadow of model development and usage. When training and inference plateau, power demand should too. And that’s exactly what early signals suggest is happening.
The First Signs of Overbuild
Grid operators, GPU suppliers, and hyperscalers are still behaving as if the demand curve is exponential. But the actual curve, if you track model capability, user adoption, and marginal utility per token, looks increasingly sublinear.
Consider the evidence:
ERCOT, the Texas grid operator, initially projected AI data centers would drive electricity demand as high as 218 GW by 2031. That forecast has since been revised down to 145 GW after discounting speculative interconnection requests that are unlikely to ever materialize.
A significant percentage of proposed data centers are not being built. Utilities report interconnection requests outpacing actual deployment by a factor of 5 to 10.
Meta’s $10B Louisiana AI campus has stalled amid scrutiny over its reliance on gas-fired generation.
Nearly half of a $5B gas plant initiative in Texas has been canceled amid uncertainty over whether the projected AI load will materialize.
If AI’s demand for compute and inference flattens, much of the power infrastructure being constructed around it becomes misaligned with value.
From Dark Fiber to Dark FLOPs
This scenario echoes the early 2000s “dark fiber” phenomenon. Telecoms, anticipating explosive internet growth, laid massive fiber-optic networks. Much of that capacity sat idle for years.
Today’s version is “dark FLOPs”: overbuilt compute clusters and stranded power capacity created in anticipation of an LLM scaling curve that is now flattening.
It’s not that AI is failing. It’s that the incremental value of scale is shrinking.
If GPT-4-class models already saturate most economic use cases, and GPT-5 isn’t radically better, then:
There’s less incentive to retrain at frontier scale.
Token usage may grow linearly or flatten, not exponentially.
Inference optimization (e.g. RAG, memory modules, small specialist models) becomes more important than sheer horsepower; a toy routing sketch follows below.
And that means: less power. Less cooling. Fewer GPU orders.
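To make that routing point concrete, here is a minimal sketch of cost-aware routing, the kind of inference optimization that substitutes for raw scale. The model names, cost figures, and escalation heuristic are illustrative placeholders, not real endpoints or prices.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # illustrative figures, not real pricing


SMALL = Model("small-specialist", 0.0002)  # hypothetical distilled/domain model
FRONTIER = Model("frontier-llm", 0.01)     # hypothetical frontier-scale model


def needs_frontier(query: str) -> bool:
    """Crude escalation heuristic: long or multi-step requests go to the big model."""
    hard_markers = ("prove", "derive", "plan", "multi-step")
    return len(query.split()) > 200 or any(m in query.lower() for m in hard_markers)


def route(query: str) -> Model:
    """Send most traffic to the cheap specialist; escalate only when needed."""
    return FRONTIER if needs_frontier(query) else SMALL


if __name__ == "__main__":
    for q in ("Summarize yesterday's standup notes.",
              "Derive a rollout plan across three regulatory regimes."):
        m = route(q)
        print(f"{m.name} (~${m.cost_per_1k_tokens}/1k tokens): {q}")
```

The economics of a router like this are the point: if most tokens never touch a frontier model, aggregate compute, and therefore power, grows far more slowly than raw usage.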
What About Jevons Paradox?
Classical energy economics tells us that improved efficiency usually increases total resource consumption. This is Jevons Paradox: as costs fall, demand explodes. Shouldn’t AI follow the same curve?
Perhaps, but only if demand is elastic and use cases proliferate faster than returns diminish. So far, that’s not what we’re seeing.
The marginal utility of scaling, whether measured in enterprise productivity, consumer adoption, or model capability, is already showing signs of tapering. And many downstream deployments (e.g. agents, copilots) are bottlenecked not by cost per FLOP, but by trust, integration, and verification.
Efficiency is improving, but demand isn’t keeping up. This isn’t Jevons. It’s saturation.
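To put illustrative numbers on that: total energy is roughly energy per token times tokens served, E_total = e_token · N_tokens. If efficiency gains cut e_token by 10x while N_tokens grows only 3x over the same period, E_total falls to 3/10 = 0.3 of its prior level. A Jevons-style rebound requires usage to grow faster than efficiency improves; the numbers here are made up, but they show why efficiency gains alone don't guarantee rising power demand.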
Nvidia: The Canary in the Thermal Mine
If this thesis is correct, one of the first public signs will be softening GPU demand at Nvidia. Here’s what to watch:
Direct signals:
Sequential decline in Nvidia's data center revenue or muted forward guidance.
Inventory build-up at Nvidia’s distribution and integration partners, such as distributors Arrow and Avnet or system assemblers like Foxconn.
Reduced advance payments or preorder volumes from hyperscalers.
Indirect signals:
Hyperscaler CapEx guidance flattens—especially from Microsoft, Google, and Amazon.
Infra startups like CoreWeave or Lambda slow down new data center buildouts or report underutilization.
Secondary GPU market activity increases—H100s and A100s hitting resale or lease-back channels.
Narrative-level signals:
Softer tone from Nvidia’s executive team—less emphasis on “limitless demand.”
Analyst questions around utilization and demand elasticity.
Gross margin compression, especially if pricing holds but volume stalls.
Any combination of these would signal that the real constraint is no longer supply—it’s demand.
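The first of those direct signals is simple enough to track mechanically. A minimal sketch, with made-up quarterly figures standing in for whatever revenue series you actually follow:

```python
def sequential_declines(revenue_by_quarter: list[float]) -> list[int]:
    """Return indices of quarters where revenue fell versus the prior quarter."""
    return [i for i in range(1, len(revenue_by_quarter))
            if revenue_by_quarter[i] < revenue_by_quarter[i - 1]]


if __name__ == "__main__":
    # Hypothetical data-center revenue series in $B; not actual Nvidia figures.
    series = [14.5, 18.4, 22.6, 22.1, 21.8]
    flagged = sequential_declines(series)
    print("Quarters with sequential decline:", flagged or "none")
```

One soft quarter is noise; two or three in a row, against guidance that assumed exponential demand, is the tell.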
The Strategic Reorientation
If AI capability is flattening but infrastructure is still expanding, we have a classic overshoot problem. The capital flows, land purchases, and power deals inked in 2023–2024 may outstrip real-world AI needs in 2025–2026.
Here’s how the smartest capital will reposition:
We’ve already seen infrastructure players like CoreWeave, Lambda, and Crusoe gear up for speculative scale. But if model returns diminish, many will be forced to pivot from speculative training scale to structured, vertically integrated inference businesses.
Why Power Tells the Truth
If you want to understand where AI is really going, ignore the hype cycles and product demos. Watch the grid.
Power demand is, in effect, the first derivative of cumulative training and inference activity, and a lagging indicator of model value. If power forecasts, buildouts, and GPU utilization rates flatten or fall, it’s not because AI died. It’s because we hit the slope of diminishing returns.
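In the notation of the earlier sketch: if C(t) is cumulative compute consumed by training and inference, then P(t) ≈ e · PUE · dC/dt. Flat power means the rate of compute usage has stopped growing; falling power means it is shrinking.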
The Future: Not Collapse, But Constraint
This doesn’t mean AI is a bubble. It means the path forward might no longer be brute-force scaling. If that is true, then architectural innovation, orchestration, tool use, and human-machine integration will become paramount.
If this thesis bears out, 2023 was about training, and 2025–2027 will be about deployment.
And as always in industrial transitions: when the capability curve flattens, the infrastructure curve overshoots.
Credit for this observation goes to Rohit Krishnan, who articulated it in this podcast with Anna Gát.