AI inference will be unbundled
The future of inference is bifurcation: cheap edge hustle at one pole, sovereign-scale sanctuaries at the other
For the past few years, the AI infrastructure story has been straightforward: everything flows into the data center cathedral. You buy more GPUs, build bigger data centers, and pray to the Nvidia god. The assumption, repeated by analysts, journalists, and investors alike, is that all inference loads lead to a data center.
But physics, law, and money don’t obey theology. They’re f…
