The AI Pullback That Didn’t Happen
Enterprise AI spend is being rationalized, not eliminated
Fortune, Tom’s Hardware, and CIO have all run versions of the same story in recent months: companies are pulling back on AI because ROI isn’t materializing. There’s data behind that claim. But much of the coverage conflates the cancellation of AI pilots with a reduction in bets on AI.
“AI spending” is not one market. The relevant buckets are hyperscaler infrastructure capex, enterprise application and software spend, experimental pilot budgets, and metered token consumption inside production workflows. All four are behaving differently. Low-quality pilots are being killed. Production deployments are concentrating around fewer vendors. Metered token consumption is outrunning enterprise budgeting models. Hyperscaler infrastructure capex remains supply-constrained. Treating those as a single demand signal called “AI spending” is where the analysis goes wrong.
At the top of the stack: supply-constrained acceleration
Hyperscaler capital expenditure has grown at a 72% annualized rate since GPT-4, with 2026 estimates from various sources running $70-$770 billion across the big five (Amazon, Alphabet, Meta, Microsoft, and Oracle[1]), up from $443 billion in 2025. AWS’s contracted backlog stands at $244 billion, up 40% year-over-year; Google’s is $240 billion; Microsoft is sitting on $80 billion of unfilled Azure orders that can’t ship because the GPUs are waiting for power infrastructure. Every major hyperscaler reports capacity being absorbed as fast as it can be physically deployed. The constraint is supply, not demand.
At the enterprise tier: high failure rates, concentrated reallocation
This is where the pullback narrative has actual numbers behind it. McKinsey’s latest State of AI research shows broad adoption but limited enterprise-level EBIT impact, with most firms still in experimentation or pilot mode rather than scaling. S&P Global found 42% of companies abandoned most of their AI projects in 2025, up from 17% in 2024. Forrester expects about half of its financial services and healthcare client base to defer planned AI outlays this year, as CFOs demand financial rigor, and formally predicted enterprises will defer 25% of planned AI spend into 2027.
But “defer” is not “cut.” Gartner projects enterprise AI applications spending to nearly triple to $270 billion in 2026, even as individual pilots get killed. Pilots are being canceled; production deployments are scaling. The experimentation budget is being reallocated toward a smaller set of proven vendors. A vendor that sells vision and gets paid to run pilots is in trouble. A vendor embedded in production deployments is not.
The Uber case: a pricing model problem, not an ROI problem
Uber burned through its entire 2026 AI budget in four months. Claude Code and Cursor adoption went from 32% of engineers in February to 84% in March to 95% using AI tools monthly by April. Per-engineer token bills ran $150-250 a month at normal usage; heavy users hit $500-2,000; the CTO personally burned $1,200 in a two-hour demo. The budget didn’t overrun because the tools failed. It overran because 5,000 engineers found the technology genuinely useful and used it accordingly.
COO Andrew Macdonald’s public response: “Maybe implicitly there’s more that is getting shipped, but it’s very hard to draw a line between one of those stats and ‘Okay now we’re actually producing like 25% more useful consumer features.’” He coined the term “tokenmaxxing” for the dynamic: employees consuming tokens at a high rate because the tools work, because an internal leaderboard was ranking teams by AI usage, because the marginal cost feels low until the budget is gone. Seventy percent of Uber’s committed code is now AI-generated. The adoption is real, but the attribution to business outcomes is not.
This is filed as AI skepticism, but it shouldn’t be. Enterprises budgeted AI like a SaaS seat: monthly fee, predictable consumption, standard ROI horizon. What they deployed, though, was a metered consumption good with demand elasticity that scales non-linearly once it hits daily engineering workflows. Box CEO Aaron Levie flagged the obvious extension: this dynamic starts in engineering and then hits legal, sales, and the rest of knowledge work. Uber’s budget overrun was caused by utility and by the complete absence of any existing framework for forecasting agentic token consumption.
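To make the seat-versus-metered gap concrete, here is a minimal Python sketch. The headcount, adoption rates, and per-engineer cost ranges come from the Uber figures above; the 90/10 split between normal and heavy users, the use of range midpoints, and the $100 seat price are hypothetical assumptions chosen only for illustration. It is not a reconstruction of Uber's actual budget.

```python
from dataclasses import dataclass


@dataclass
class UsageTier:
    """A group of active users with a common average monthly token bill (USD)."""
    share: float         # fraction of active users in this tier
    monthly_cost: float  # assumed average monthly bill per user


def monthly_metered_spend(headcount: int, adoption: float, tiers: list[UsageTier]) -> float:
    """Expected monthly spend when cost scales with active users and their usage mix."""
    if abs(sum(t.share for t in tiers) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    active_users = headcount * adoption
    blended_cost = sum(t.share * t.monthly_cost for t in tiers)
    return active_users * blended_cost


def months_until_exhausted(annual_budget: float, monthly_spend: float) -> float:
    """How long an annual budget lasts at a constant monthly burn rate."""
    return annual_budget / monthly_spend


HEADCOUNT = 5_000           # engineers, from the article
PLANNED_ADOPTION = 0.32     # February adoption, from the article
ACTUAL_ADOPTION = 0.95      # April adoption, from the article

# ASSUMPTION: 90% normal users at the midpoint of $150-250,
# 10% heavy users at the midpoint of $500-2,000.
TIERS = [UsageTier(0.90, 200.0), UsageTier(0.10, 1_250.0)]

# ASSUMPTION: a hypothetical flat seat price used to set the annual budget.
SEAT_PRICE = 100.0

seat_budget = HEADCOUNT * PLANNED_ADOPTION * SEAT_PRICE * 12
metered = monthly_metered_spend(HEADCOUNT, ACTUAL_ADOPTION, TIERS)

print(f"Seat-style annual budget: ${seat_budget:,.0f}")
print(f"Metered monthly spend:    ${metered:,.0f}")
print(f"Budget lasts:             {months_until_exhausted(seat_budget, metered):.1f} months")
```

Under any plausible inputs the pattern is the same: a budget sized on seats and early adoption is exhausted within months once both adoption and per-user consumption rise.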
The labor case: trading humans for AI
Capital is being reallocated from human labor to compute. Meta recently cut 8,000 jobs while raising 2026 capex guidance to between $115 and $135 billion. Amazon cut 16,000 corporate workers the same quarter it committed to $185-200 billion in AI infrastructure. In some cases that’s direct substitution; in others it’s margin protection or capex funding. But the corporate trade is visible, and the companies making it are buying more GPU capacity than anyone in history. Calling this a pullback is a category error.
What this all means
Each of the four buckets implies something different for the compute demand curve. If the read is “enterprise ROI is bad so AI is declining,” inference demand looks structurally impaired and neocloud revenue assumptions are built on sand. If the read is “pilots are dying but production deployments are scaling and token consumption is structurally outrunning every enterprise budget model,” the inference curve is volatile, but on an upward trajectory. What this suggests is that AI is a metered consumption good that ought to be priced like a utility, not modeled on seat-count SaaS economics.
At the enterprise level, Jevons paradox is running in real time. Cheaper, more capable tools get consumed at higher volumes, not lower ones, compounded by workflow embedding, internal usage incentives, and agentic multi-step consumption that burns through tokens at orders of magnitude above what any SaaS model would project.
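One common way to formalize the Jevons dynamic is a constant-elasticity demand curve, where tokens consumed scale as price raised to the power of minus the elasticity. The sketch below assumes that functional form, and the elasticity values are hypothetical; nothing in the reporting above gives a measured figure. It shows only that when elasticity exceeds 1, a price cut raises total spend rather than lowering it.

```python
def quantity_ratio(price_ratio: float, elasticity: float) -> float:
    """Change in tokens consumed under constant-elasticity demand.

    price_ratio is new_price / old_price; elasticity is expressed as a
    positive number (e.g. 1.5 means a 1% price cut lifts usage by ~1.5%).
    """
    return price_ratio ** (-elasticity)


def spend_ratio(price_ratio: float, elasticity: float) -> float:
    """Change in total spend: price change times quantity change."""
    return price_ratio * quantity_ratio(price_ratio, elasticity)


# ASSUMPTION: token prices halve; the elasticities are illustrative only.
for elasticity in (0.5, 1.0, 1.5):
    usage = quantity_ratio(0.5, elasticity)
    spend = spend_ratio(0.5, elasticity)
    print(f"elasticity={elasticity}: usage x{usage:.2f}, spend x{spend:.2f}")
```

With elasticity below 1, cheaper tokens shrink the bill; above 1, the bill grows even as unit prices fall, which is the case the Uber numbers resemble.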
The real constraint here is measurement, not demand.
[1] Some will object to Oracle being included among the hyperscalers; the source link, from Epoch AI, includes it as one of the hyperscalers, and I am following their convention.

So, let's say $200k/year per job. For Amazon's 16,000 cut jobs, that's $3.2 billion a year in payroll, against $185 billion in spending on compute: roughly 60 years' worth of that payroll. Big expectations there.
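The back-of-envelope comparison above can be checked in a few lines of Python. The 16,000 jobs and the $185-200 billion infrastructure range are the Amazon figures cited earlier; the $200k fully loaded cost per job is the commenter's assumption, not a reported number.

```python
JOBS_CUT = 16_000                 # Amazon corporate roles, from the article
COST_PER_JOB = 200_000            # ASSUMPTION: USD per job per year
CAPEX_LOW = 185_000_000_000       # low end of the committed range, USD
CAPEX_HIGH = 200_000_000_000      # high end of the committed range, USD

annual_payroll = JOBS_CUT * COST_PER_JOB
years_low = CAPEX_LOW / annual_payroll
years_high = CAPEX_HIGH / annual_payroll

print(f"Annual payroll saved: ${annual_payroll / 1e9:.1f}B")           # $3.2B
print(f"Capex in payroll-years: {years_low:.1f} to {years_high:.1f}")  # 57.8 to 62.5
```

The infrastructure commitment is therefore worth roughly 58 to 62 years of the eliminated payroll, which is where the "60 years" shorthand comes from.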
Attribution is the missing layer in the AI infrastructure stack. Companies can track GPU utilization and token volumes, but they cannot yet connect compute spend to business output. Until that link exists, pricing and allocation are approximations. The capex return problem is fundamentally an attribution problem.