Welcome to the latest edition of Buy the Rumor; Sell the News. In today’s post, I dissect venture capitalists’ claims about capital efficient AI startups.
If you like what you read here, consider subscribing, and if you’re already subscribed, consider upgrading to a paid subscription. If you want to connect with me directly, my contact information is at the end of this post.
There’s a new religion in venture capital: the gospel of “capital-efficient” AI.
Spend five minutes on LinkedIn or Twitter and you’ll see it chanted like liturgy: tiny teams, single-digit headcount, eight-figure run rate. “Look! Five engineers, $3 million ARR, Series A at 30× revenue! Hallelujah!”
Sure. And I’m an Olympic gymnast.
What capital efficiency actually means
Let’s ground this. Capital efficiency isn’t just small teams and fast ARR. It’s about:
Minimal outside capital to reach escape velocity.
High enterprise value per dollar raised.
Incremental growth funded by internal cash flow, not dilution.
Classic SaaS nailed it. You’d raise a modest seed, build once, replicate infinitely at near-zero marginal cost, with 85–90% gross margins. Each dollar reinvested bought more growth, compounding relentlessly.
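To see why that compounding was so powerful, here's a stylized sketch in Python. The 85–90% margin comes from above; the reinvestment yield (new ARR per dollar of gross profit plowed back into growth) is my own illustrative assumption, not a benchmark.

```python
# Stylized SaaS compounding: a high gross margin means most of each
# revenue dollar can be recycled into growth. The reinvestment yield
# is an illustrative assumption, not a measured figure.

arr = 1_000_000        # starting ARR, $
gross_margin = 0.85    # classic SaaS range, per the text
growth_yield = 0.7     # assumption: new ARR per $1 of gross profit reinvested

for year in range(1, 6):
    arr += arr * gross_margin * growth_yield   # reinvest the gross profit
    print(f"Year {year}: ARR ${arr:,.0f}")
# Roughly 60% compounded growth per year, with no new outside capital.
```

The exact numbers don't matter; the point is that high gross margins let growth fund itself.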
Why VCs think AI looks the same
Because on the surface, it does:
Small teams bolt APIs from OpenAI, Anthropic, or Mistral onto a React front-end.
Ship a product in weeks, get to market in months.
Land big contracts with a headcount you can count on two hands.
It smells like WhatsApp or Instagram: minimal staff, eye-popping outcomes. Silicon Valley lives for these stories. Who doesn’t want another 13-person, $1 billion exit?
Why the economics unravel immediately
1. Marginal costs scale with usage
Unlike SaaS, where your marginal cost to serve the next customer is near zero, AI burns cash for every query.
Typical usage: A single user session might run 10,000 tokens. At $0.03 per 1K tokens for GPT-4, that’s $0.30 per session in direct API cost.
Gross margin reality: Selling a $50/month seat? That margin is no longer 90%. Factor in inference costs, retrieval latency tricks, and monitoring overhead, and gross margins routinely drop to 60–70%.
This is before paying engineers, running your own tuning jobs, or handling support. OpenAI’s own data tells the same story.
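To make the arithmetic concrete, here's a back-of-envelope sketch. The token count and API price come from the example above; sessions per month and non-inference overhead are illustrative assumptions.

```python
# Back-of-envelope gross margin for an LLM wrapper selling a $50/month
# seat. Token figures and API price are from the example above;
# sessions per month and overhead are illustrative assumptions.

price_per_seat = 50.00        # $/month
tokens_per_session = 10_000
price_per_1k_tokens = 0.03    # GPT-4-class API pricing, $ per 1K tokens
sessions_per_month = 40       # assumption: a moderately active user

inference_cost = (tokens_per_session / 1_000) * price_per_1k_tokens * sessions_per_month
overhead = 5.00               # assumption: retrieval, retries, monitoring, $/seat/month

gross_margin = (price_per_seat - inference_cost - overhead) / price_per_seat
print(f"Inference cost per seat: ${inference_cost:.2f}")  # $12.00
print(f"Gross margin: {gross_margin:.0%}")                # 66%
```

Push usage to 100 sessions a month and the same seat is down to a 30% gross margin.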
2. The Jevons sucker punch
“But costs will fall!” you say. Sure. But cheaper inference triggers elastic demand. As tokens get cheaper, products embed more LLM calls, users query more, agents spawn sub-agents. And why do products embed more LLM calls? Because your startup competes with a dozen other startups operating in the same niche on top of the model.
Jevons paradox 101: Efficiency doesn’t drop costs; it grows consumption until your aggregate GPU bill goes up, not down.
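Here's a toy constant-elasticity model of that dynamic. The elasticity and baseline figures are assumptions chosen for illustration; the mechanism is the point: with demand elasticity above 1, every price cut raises the aggregate bill.

```python
# Toy Jevons model: token demand responds to price with constant
# elasticity. With elasticity > 1, every price cut *raises* total spend.
# The elasticity and baseline figures are illustrative assumptions.

def monthly_spend(price_per_1k, elasticity=1.5,
                  base_price=0.03, base_tokens_k=1_000_000):
    """Spend = price * demand, where demand ~ (price / base_price) ** -elasticity."""
    tokens_k = base_tokens_k * (price_per_1k / base_price) ** -elasticity
    return price_per_1k * tokens_k

for price in (0.03, 0.015, 0.0075):  # inference gets 2x cheaper each step
    print(f"${price:.4f}/1K tokens -> ${monthly_spend(price):,.0f}/month")
# $0.0300/1K tokens -> $30,000/month
# $0.0150/1K tokens -> $42,426/month
# $0.0075/1K tokens -> $60,000/month
```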
3. You don’t own the asset. So you don’t control the economics.
In SaaS, your code was the moat. In pharma, it’s the patent. In oil, it’s the well. In AI? The entire cost structure is owned by parties other than the startup.
OpenAI can raise prices tomorrow. Nvidia just jacked H100 costs again. ERCOT’s capacity constraints mean your new GPU farm pays a premium for power, if you can get it at all.
This is why I keep saying: AI isn’t SaaS. It’s an industrial system masquerading as software. Running an AI startup with five engineers is like operating a steel mill with a crew of five. Your labor costs look amazing, right up until your P&L is held hostage to power prices and commodity futures.
4. “But our team is tiny!”
Operationally lean ≠ capital efficient if your working capital is tied up in renting someone else’s refinery.
Early-stage VCs love the capital efficiency claim because it papers over ugly truths:
Write small checks to fund quick launches.
Get to $2–3M ARR with 5 people.
Mark up at 10× for a notional $30M post-money.
But the minute these businesses scale, the real costs explode, straight into Nvidia’s and AWS’s pockets. Jeff Bezos reportedly said of the incumbents Amazon displaced, “your margin is my opportunity.” Here we see it play out in real time.
5. Pre-empting the usual VC counterarguments
“Inference costs are dropping, margins will expand.”
Not really. Model footprints balloon faster (GPT-2 → GPT-4 is a ~500× jump in training FLOPs), and context windows keep growing. The effective cost per useful output isn’t plummeting.
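A rough sketch of why, with both growth rates as illustrative assumptions: if per-token prices halve each year while tokens consumed per task double, the cost per task never moves.

```python
# If per-token prices halve each year but tokens per task double
# (bigger contexts, more calls, agent loops), cost per task is flat.
# Both growth rates are illustrative assumptions, not measurements.

price_per_token = 3e-5     # $/token, GPT-4-class order of magnitude
tokens_per_task = 10_000

for year in range(4):
    print(f"Year {year}: ${price_per_token * tokens_per_task:.2f} per task")
    price_per_token *= 0.5  # assumed annual price decline
    tokens_per_task *= 2    # assumed growth in tokens consumed per task
# Prints $0.30 every year: cost per useful output never falls.
```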
“We’ll fine-tune small models!”
Narrow domains work, until your customers demand general reasoning, multiple languages, and multimodal support. Now you’re back to square one, paying for foundation models. Remember: your startup competes with a dozen other startups making the same API calls to the same underlying model. You have no choice but to ship new features, and eat the extra inference cost, to retain customers.
“We’re funded by cloud credits!”
Congrats. That’s a temporary subsidy, not a business model. The cliff when credits expire is brutal.
6. Who is actually capital efficient in AI?
Very few. The winners own the asset:
Exclusive data that lowers training costs or improves output quality. (But that’s rare.)
Vertical integration into silicon and inference. Why do you think OpenAI is rumored to be exploring custom chips? Why is Amazon obsessed with Trainium?
Picks and shovels. Orchestration software that helps enterprises optimize model usage: basically taking a rake on everyone else’s GPU addiction.
Everyone else? They’re drop-shipping compute at retail and hoping their UI masks it.
7. The bottom line
AI isn’t SaaS. It’s capex-heavy infrastructure with a slick UX.
Real capital efficiency means owning the mines, the pipelines, and the power plants: the data, the models, the silicon, the energy. Everyone else is just a high-margin story with low-margin guts.
So next time a VC tells you about the capital efficiency of five engineers and a $3M run rate, ask them who owns the GPUs.
When the tide goes out, it’s Nvidia, not your startup, that’s still wearing trunks.
Coda
If you enjoy this newsletter, consider sharing it with a colleague.
Most posts are public. Some are paywalled.
I’m always happy to receive comments, questions, and pushback. If you want to connect with me directly, you can:
VCs zoom in on two years of fast growth and apply a 30× ARR multiple. Pipelines, wires, and factories are not sexy to the average investor. The sleek UX still gets all the attention.
One surprise for me in the last 12 months is how much value capture is happening at the model layer. I would have thought that the models would behave like near-perfect substitutes and that competition and open source (Meta) would push inference prices to near zero. But it looks increasingly like OpenAI will end up being a valuable toll booth.