Jensen Huang on Anthropic, OpenAI, China, and demand for inference tokens
Dwarkesh Patel recently interviewed Jensen Huang; let's take a look at how the purveyor of AI picks and shovels sees the AI world
This post is a summary of, and commentary on, Dwarkesh Patel’s recent interview with Nvidia cofounder and CEO Jensen Huang.
Dwarkesh Patel sat down with Jensen Huang for ninety minutes recently. Most of the conversation was Jensen on autopilot: five-layer cake, electrons-to-tokens, install base, ecosystem, flywheel. The familiar liturgy. But there were three exchanges where Jensen said more than he probably intended, and one where he visibly lost his composure. Each is worth reading carefully because each maps onto a question that matters for how capital is being deployed in this cycle.
The supply chain moat is informational, not contractual
Jensen put the headline number in plain view: roughly $250B in upstream purchase commitments per SemiAnalysis estimates, with the explicit commitments being only part of the story. The implicit part, which should interest anyone underwriting GPU collateral, is what he described as “informing, inspiring, and aligning with CEOs of all different industries upstream.” SK Hynix, Micron, TSMC, Lumentum, and Coherent are making capacity investments because Jensen has personally walked them through the demand picture, and they trust Nvidia’s downstream offtake more than they trust anyone else’s. This is supply chain capture by relationship, not by contract. It is also the single best argument for why competing accelerator programs hit a ceiling that has nothing to do with engineering: upstream suppliers will not underwrite a second curve at scale until they see Nvidia-equivalent downstream demand, and that demand cannot materialize until the capacity to serve it exists.
The CoWoS anecdote is the worked example. Two years of “swarming” turned advanced packaging from a specialty constraint into a mainstream technology, and TSMC now scales packaging in lockstep with logic. Nvidia got the supply chain to internalize its roadmap. The next supply chain risks Jensen is prefetching are silicon photonics, double-sided probing, and EUV throughput at ASML.
“Anthropic is a unique instance, not a trend”
Dwarkesh pressed on obvious counterexamples, including Anthropic using TPUs and Trainium chips, OpenAI’s AMD deal, and the rumored Titan accelerator, and Jensen’s response collapsed the entire non-Nvidia training story to a single name. “Without Anthropic, why would there be any TPU growth at all? It’s 100% Anthropic.”
He is largely right. The vertical integration thesis, which is that frontier labs will inevitably bring silicon in-house, collapses to one data point when you look at who is actually moving meaningful workloads off Nvidia. (We will have to revisit this claim if Musk’s Terafab comes to fruition.) And Jensen’s explanation for why even that data point exists is more candid than I expected: Nvidia could not make the multibillion-dollar equity investment that Anthropic needed to underwrite its compute commitment, AWS and Google could, and so the offtake followed the capital. He framed this as his “miss.” It is also the playbook he is now running explicitly with OpenAI (~$30B reported), Anthropic (~$10B), and the neocloud cohort.
This is the argument worth carrying into the structured credit side of the book. The lab-grade compute deals do not get done on a total-cost-of-ownership basis. They get done on equity-and-offtake bundles where the chip vendor is functionally the lender of last resort. Anyone modeling Nvidia share against a “labs defect to ASIC” thesis is missing the financing channel that drove the only meaningful defection so far.
The financialization layer is deliberate
The most useful answer for anyone in the GPU credit ecosystem came when Dwarkesh asked the obvious question: Nvidia has the cash, why not become a hyperscaler? Jensen’s answer was a doctrine: “Do as much as needed, as little as possible.” He named CoreWeave, Nscale, and Nebius explicitly and said none of them would exist in their current form without Nvidia’s support, but he was equally clear that the support stops short of disintermediating them. “Do we want to be in the financing business? The answer is no.”
Read that against Nvidia’s assorted investments: the CoreWeave backstop ($6.3B reported), the $2B equity check, the OpenAI investment, and the warrants and side letters that nobody is publicly enumerating. Nvidia is not staying out of finance. It is staying out of cloud P&L. The financialization layer, meaning the CapEx-to-OpEx conversion that the neoclouds and the structured credit market are getting paid to perform, exists because Nvidia chose to let it exist. It is downstream of a deliberate choice to keep the business model as simple as possible while letting the duration mismatch sit on someone else’s balance sheet. The neoclouds are not capturing residual rents Nvidia missed. They are absorbing risk Nvidia priced out.
China: the only place Jensen loses composure
The China section runs forty minutes and is the only stretch where Jensen visibly drops the keynote voice. Dwarkesh’s argument, that selling H20-class compute into China shortens the timeline on offensive cyber capability and that the marginal flop matters, drew the most agitated Jensen on record. What followed: telecom analogies, accusations of “loser premise” and “childish absolutes,” and repeated insistence that 50% of AI researchers are Chinese, that 60% of mainstream chip manufacturing is Chinese, and that Huawei just had a record year.
Strip away the irritation and the economic argument is coherent. If you concede the second-largest compute market, you concede the developer ecosystem that compounds the CUDA moat, and you accelerate the emergence of a non-American stack that diffuses through the Global South as the open-weight default. The strategic question is whether you trade short-term capability denial for long-term standards capture. Jensen’s view is that the denial does not actually deny, while the concession does in fact concede. In other words, China has the energy, the manufacturing, the researchers, and the algorithmic ingenuity to compensate for being held to 7nm chips. Why let them use homegrown technology when they could be yoked to American tech?
I am not sure he is wrong. I am certain that the absolutism of the export control discourse on both sides is generating worse policy than either side would design if forced to think probabilistically.
Other miscellaneous observations
Jensen confirmed that Nvidia segments the inference Pareto frontier, that is, the set of operating points where you cannot get faster per-user tokens without giving up aggregate throughput. In practice it accepts lower throughput in exchange for premium-ASP, low-latency tokens, with Groq folded into CUDA as the vehicle. The justification was telling: software engineers are valuable enough that they will pay materially more for faster tokens, and the inference market is finally rich enough to support tiered pricing rather than throughput maximization on a single curve.
Anyone modeling token economics on a single curve, or building a thesis on aggregate inference revenue per watt that way, is about to get the wrong answer. The shape of the inference market increasingly looks like the shape of the bandwidth market in the late 1990s: same physical commodity, multiple service tiers, very different unit economics depending on which tier you sell into. That tiering is where interesting collateral and derivatives arise.
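To make the single-curve point concrete, here is a minimal Python sketch. Every number in it is hypothetical and chosen only for illustration; none comes from the interview or from Nvidia. It treats each serving configuration as a point with two axes (per-user token speed and per-GPU throughput), keeps only the Pareto-efficient points, and then compares revenue per GPU-hour under tiered prices against a model that assumes one price for every token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    name: str
    tokens_per_sec_per_user: float   # interactivity: higher means lower latency
    tokens_per_sec_per_gpu: float    # aggregate throughput per GPU
    price_per_million_tokens: float  # hypothetical tier price in dollars


def pareto_frontier(points):
    """Keep the points that no other point beats on both axes."""
    frontier = []
    for p in points:
        dominated = any(
            q.tokens_per_sec_per_user >= p.tokens_per_sec_per_user
            and q.tokens_per_sec_per_gpu >= p.tokens_per_sec_per_gpu
            and (
                q.tokens_per_sec_per_user > p.tokens_per_sec_per_user
                or q.tokens_per_sec_per_gpu > p.tokens_per_sec_per_gpu
            )
            for q in points
        )
        if not dominated:
            frontier.append(p)
    return sorted(frontier, key=lambda p: p.tokens_per_sec_per_user)


def revenue_per_gpu_hour(point, price_per_million_tokens=None):
    """Dollars per GPU-hour; pass a price to override the tier's own price."""
    price = (
        point.price_per_million_tokens
        if price_per_million_tokens is None
        else price_per_million_tokens
    )
    million_tokens_per_hour = point.tokens_per_sec_per_gpu * 3600 / 1_000_000
    return million_tokens_per_hour * price


# Hypothetical configurations, not measured data.
POINTS = [
    OperatingPoint("batch", 20, 10_000, 1.0),
    OperatingPoint("standard", 60, 5_000, 3.0),
    OperatingPoint("premium", 300, 1_000, 20.0),
    OperatingPoint("misconfigured", 50, 4_000, 3.0),  # beaten by "standard"
]

if __name__ == "__main__":
    flat_price = 1.0  # the single-curve assumption: every token sells alike
    for p in pareto_frontier(POINTS):
        tiered = revenue_per_gpu_hour(p)
        flat = revenue_per_gpu_hour(p, flat_price)
        print(f"{p.name:>9}: tiered ${tiered:.2f}/GPU-hr, flat ${flat:.2f}/GPU-hr")
```

Under the flat-price assumption the highest-throughput point always looks best, so the model says to maximize throughput. With these made-up tier prices the ranking inverts and the lowest-throughput point earns the most per GPU-hour, which is the sense in which a single-curve model gets the wrong answer.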
What is Pareto referring to?