Hugging Face and the Illusion of Infrastructure
Hugging Face is a gallery of dead models. To survive, it must become a factory.
Hugging Face bills itself as the GitHub of machine learning. That comparison flatters both parties: GitHub brings structure to the chaos of open-source code, and Hugging Face aspires to do the same for the disjointed zoo of language models, datasets, and machine learning pipelines. But unlike GitHub, which monetizes a foundational need across the software stack, Hugging Face exists in a more fragile and contingent place. It does not own the code, nor the compute, nor the models that define the frontier of AI capability. It is a conduit, not a creator. And that distinction matters more than most people realize.
At first glance, Hugging Face looks like a treasure trove: tens of thousands of language models, sprawling across every use case imaginable. But this abundance is an illusion. Just as the crypto world is littered with thousands of coins no one trades, Hugging Face is bloated with models no one uses. The power law is extreme: a few models like Llama, Mistral, and GPT-2 clones account for nearly all meaningful usage, while the rest serve as digital detritus: dead forks, vanity fine-tunes, or models that never worked in the first place. One Llama variant with eleven downloads sits next to Mistral with millions. The UX treats them as equals. That’s not openness; it’s entropy.
So what, then, is the value? Why does Hugging Face exist? The answer is deceptively simple: optionality. Hosting every model under the sun makes Hugging Face the default namespace for open AI. If a model exists, chances are it lives on Hugging Face. That optionality is not worthless. It creates a surface area for innovation, remixing, and serendipity. But it is far from a business model.
Right now, Hugging Face is burn-heavy. It's a high-traffic, low-monetization platform subsidized by venture capital and driven by developer goodwill. Free users consume bandwidth and GPU cycles without paying for them. Enterprises poke around but are slow to commit. Like GitHub in its early days, or Reddit for most of its history, Hugging Face sits atop an ocean of usage with very little monetized throughput. The problem isn't traffic; it's capture.
The obvious path forward is to become infrastructure. Not physical infrastructure, but software infrastructure: standardized APIs, programmable access, compliance hooks, and observability layers that let real businesses run real workloads. Think Stripe for LLMs. Think Snowflake for AI pipelines. This is the only defensible long-term path.
To get there, Hugging Face needs to pivot from being a library to being a runtime. Right now, most developers treat it like an archive: a place to browse models, download weights, and tinker. But a runtime mindset means building, deploying, and serving production-grade AI applications directly from within the Hugging Face ecosystem. It means offering guarantees: uptime, latency, performance, cost predictability. It means turning usage into throughput, not just traffic.
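The distinction is easiest to see in code. Here is a minimal sketch of the two modes using the real `transformers` and `huggingface_hub` libraries; the model IDs are illustrative, and the hosted call assumes a valid Hugging Face token and a currently available inference endpoint.

```python
# "Library" mode: download weights and run them yourself.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # weights pulled to local disk
print(generator("Open AI ecosystems win when", max_new_tokens=20)[0]["generated_text"])

# "Runtime" mode: call a hosted endpoint; the platform, not you, is on the
# hook for uptime, latency, and cost. Model ID here is illustrative.
from huggingface_hub import InferenceClient

client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2")
print(client.text_generation("Open AI ecosystems win when", max_new_tokens=20))
```

The first pattern is an archive transaction: bandwidth out, nothing metered. The second is throughput Hugging Face can meter, guarantee, and bill for.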
This pivot is structurally difficult. Hugging Face lacks proprietary IP. It has not trained any foundation model of note since BLOOM, and that effort was more symbolic than strategic. Without a vertically integrated model stack, Hugging Face depends entirely on others for core capability. This makes it fragile: if Meta, Mistral, or OpenRouter decide to host their own endpoints or build better APIs, Hugging Face becomes a middleman that can be disintermediated at any time.
It gets worse. The cloud hyperscalers are circling. AWS, Azure, and GCP all offer their own LLM platforms, increasingly bundled with model registries, inference endpoints, fine-tuning workflows, and enterprise governance layers. Hugging Face may partner with these providers today, but in the long run it risks being swallowed by them. If you're a Fortune 500 CIO already embedded in AWS, why would you trust your LLM stack to a thin layer of Python wrappers?
Then there is the branding paradox. Hugging Face is beloved by the open-source community precisely because it is open, chaotic, and free. But enterprise buyers don't want chaos. They want SLAs, audit logs, reproducibility, and compliance. They want control planes, not playgrounds. The GitHub comparison breaks down here. GitHub succeeded not just by hosting code but by embedding itself into CI/CD pipelines, IDEs, and permission hierarchies. Hugging Face hasn’t crossed that Rubicon.
To make the leap, Hugging Face needs to own more of the development loop. Today, developers build elsewhere and come to Hugging Face to publish. Tomorrow, Hugging Face must become the place where you train, tune, deploy, and monitor your models end-to-end. That means native agent frameworks, first-class support for RAG architectures, and a deeply integrated CI/CD pipeline for model workflows. It means real-time evaluation tooling, live inference dashboards, and version-controlled APIs for app deployment.
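What a "CI/CD pipeline for model workflows" might look like is easiest to see in miniature. The sketch below is hypothetical: the model ID, eval set, `run_eval` helper, and 0.85 threshold are all invented for illustration, and nothing here is an existing Hugging Face product. The point is the shape: an evaluation gate that blocks promotion of a regressed model, the way a failing test blocks a merge.

```python
# Hypothetical CI gate for a model workflow: block a deploy if a
# candidate model regresses on a pinned evaluation set.
from transformers import pipeline

MODEL_ID = "your-org/candidate-model"  # hypothetical candidate revision

EVAL_SET = [  # pinned prompt / expected-answer pairs
    ("What is the capital of France? Answer:", "paris"),
    ("2 + 2 = ", "4"),
]

def run_eval(model_id: str) -> float:
    """Return the fraction of prompts whose output contains the expected answer."""
    generator = pipeline("text-generation", model=model_id)
    hits = 0
    for prompt, expected in EVAL_SET:
        output = generator(prompt, max_new_tokens=16)[0]["generated_text"]
        hits += int(expected.lower() in output.lower())
    return hits / len(EVAL_SET)

if __name__ == "__main__":
    score = run_eval(MODEL_ID)
    assert score >= 0.85, f"candidate scored {score:.2f}; blocking deploy"
    print(f"candidate scored {score:.2f}; promoting")
```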
It also means productizing enterprise infrastructure. Not just inference endpoints, but persistent model APIs with governance layers: permissioning, observability, rollback, and drift detection. Hugging Face should own the ML equivalent of Terraform and Datadog combined: a programmable, monitorable deployment surface that enterprises can trust.
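Drift detection, at least, does not need to be exotic. Below is a minimal illustration, assuming you log a scalar signal per request (here, synthetic confidence scores) and compare a live window against a reference window with a two-sample Kolmogorov-Smirnov test; the data is fabricated and the p-value threshold is an arbitrary illustrative choice.

```python
# Minimal drift-detection sketch: flag when a live window of a logged
# model signal diverges from a reference window.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.beta(8, 2, size=5_000)  # e.g. confidence scores at launch
live = rng.beta(6, 3, size=5_000)       # e.g. scores from today's traffic

stat, p_value = ks_2samp(reference, live)
if p_value < 0.01:
    print(f"drift detected (KS statistic {stat:.3f}); candidate for rollback")
else:
    print("no significant drift")
```

A governance layer is mostly plumbing like this, wired to permissions, alerting, and one-click rollback; the value is in owning the surface, not the statistics.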
More radically, Hugging Face should consider training its own LLM. This would not be for capability leadership, but for integration control. A reference model deeply optimized for Hugging Face's toolchain would anchor the ecosystem and reduce reliance on external labs. This is the role Rust played for Mozilla, or Next.js plays for Vercel. Own the reference implementation, and you own the tooling decisions that cascade from it.
Of course, Hugging Face will never own the physical infrastructure. It will always sit above the layer of steel, silicon, and power. But that doesn’t preclude control. Cloudflare doesn’t own fiber, yet it controls traffic. Stripe doesn’t own banks, yet it mediates transactions. Hugging Face can become the orchestration and governance layer for applied machine learning. But only if it chooses to.
Because remaining a platform of zombie models and free-tier usage is a slow death. The only path forward is to operationalize. Hugging Face has to become the operating layer for enterprise AI. Not the storage layer. Not the archive. The runtime.
This transition is existential. If it fails, Hugging Face will go the way of SourceForge: a once-beloved host of open artifacts, slowly abandoned as serious users migrate to better-integrated, professionally managed alternatives. If it succeeds, it becomes the Docker, the Stripe, or the GitHub of AI. But only if it earns it.
Infrastructure, in the real world, means cement, steel, copper, and labor. In the digital world, it means repeatability, reliability, and integration. Right now, Hugging Face is a gallery. To survive, it needs to become a factory.
And if it can't? It will make a superb Nvidia acquisition.