Mentat curated
Artificial Intelligence medium · independent

The Jevons bill comes due

AI inference keeps getting cheaper per token, yet the bills are climbing fast — because every price cut invites buyers to spend the savings on more tokens, and then some.

In late March, Gartner put numbers on a paradox the industry had been circling for a year. Running a question through a trillion-parameter model, it forecasts, will cost providers more than 90% less by 2030 than it does today. And enterprises will not save a cent. The reason is that the newest AI agents do not answer a question and stop; they plan, call tools, check their own work, and retry — burning five to thirty times more tokens per task than a chatbot. Unit cost falls; total consumption rises faster.

Agentic AI uses 5 to 30 times more tokens per task than a chatbot, so even a 90% cheaper token can leave the bill higher.
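The arithmetic behind that claim can be sketched in a few lines. This is an illustration only, under two assumptions that are not in the Gartner forecast itself: that the full 90% provider-side cost drop is passed through to the buyer's per-token price, and that the number of tasks stays constant. The function names are hypothetical.

```python
def bill_multiplier(token_multiplier: float, price_cut: float) -> float:
    """Ratio of new per-task spend to old per-task spend.

    token_multiplier: how many times more tokens the new workload uses per task.
    price_cut: fractional drop in price per token (0.90 means 90% cheaper).
    Assumes the price cut is fully passed through to the buyer.
    """
    return token_multiplier * (1.0 - price_cut)


def break_even_token_multiplier(price_cut: float) -> float:
    """Token multiplier at which per-task spend is unchanged."""
    return 1.0 / (1.0 - price_cut)


if __name__ == "__main__":
    PRICE_CUT = 0.90  # the forecast's "more than 90% less", taken at exactly 90%
    for tokens in (5, 10, 30):  # the article's 5x-30x range, plus the midpoint of interest
        ratio = bill_multiplier(tokens, PRICE_CUT)
        print(f"{tokens}x tokens at {PRICE_CUT:.0%} cheaper: per-task bill is {ratio:.1f}x the old one")
    print(f"break-even: {break_even_token_multiplier(PRICE_CUT):.0f}x tokens")
```

Under these assumptions the per-task bill only rises once an agent uses more than about ten times the tokens of the chatbot it replaces: at the low end of the 5x to 30x range it roughly halves, and at the high end it roughly triples. Growth in the number of tasks, which the sketch holds fixed, pushes the total higher still.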

This is the Jevons paradox, the 1865 observation that more efficient steam engines burned more coal, not less, because cheap power created appetite for it. a16z named the AI version outright in a chart this spring — 'Textbook Jevons,' price down, demand up — as the routing service OpenRouter reported moving tens of trillions of tokens, roughly five times the volume of six months earlier.

What sharpens the paradox is that its two halves collided in the same week. While venture capitalists framed collapsing prices as an abundance story, buyers were panicking: TechCrunch reported firms that had burned through their entire 2026 token budget three times over by April, and one company hit with a single-month bill of half a billion dollars after setting no limits. Cheaper intelligence is not producing cheaper AI. It is producing more AI, at a higher total price. And the line everyone watches understates it: as Dave Blundin notes, the demand curve is artificially flat where capacity is simply sold out.

The caveat worth keeping is that the steepest cost declines are recent. Epoch AI, which tracks inference prices independently, finds the drops vary wildly by task and that the fastest ones may not hold. Whether the savings keep arriving fast enough to feed the appetite is the open question under every enterprise budget for the rest of the decade.

The lenses

Novelty 2
Impact · breadth 4
Impact · depth 3
Actionable 1
Substance 3
Hype 4

The facts

Gartner forecast: Inference on a trillion-parameter model to cost providers 90%+ less by 2030 vs 2025, yet enterprise spend rises anyway.
Why bills climb: Agents plan, use tools and retry, burning 5-30x the tokens of a single chatbot reply.
Demand: OpenRouter is routing tens of trillions of tokens, ~5x its volume six months earlier, and partly capped by sold-out capacity.
Source: metatrends.substack.com
