Is AI Really Slowing Down — Reading the 2026 AI Bubble Debate Through Datacenter Economics

Introduction — Why This Debate, Why Now

In the first half of 2026, the hottest debate in tech is happening not in model benchmarks but in accounting ledgers. The essay "AI is slowing down" from Ed Zitron's newsletter hit Hacker News and GeekNews hard, and almost simultaneously The Economist published a piece asking whether the stock market can swallow the mega-IPO candidates Anthropic, SpaceX, and OpenAI.

The core question of the debate is simple. How much revenue is required to justify the AI infrastructure investment already promised — and is that revenue actually materializing?

There are reasons this question feels especially heavy in June 2026. First, tallies show hyperscaler datacenter construction plans have reached a cumulative 190GW. Second, some analysts argue that recouping this investment requires the AI industry to generate on the order of 2 trillion dollars in annual revenue by 2030. Third, amid all this, the IPO wave — including reports of an Anthropic S-1 filing — is gathering momentum, putting private-market valuations on the verge of public-market scrutiny.

This post decomposes both camps' claims into numbers, runs the comparison with the dotcom bubble, and lays out what developers and companies should do now. To say it upfront: the conclusion of this post is neither "it is a bubble" nor "it is not." Whichever way it tips, only those who understand the arithmetic of both sides survive.

Groundwork — Terms and Units

Before the main argument, let us pin down the terms and units that recur in this debate. In a fight over numbers, a feel for units is half the battle.

| Term | Meaning | Rule of thumb |
| --- | --- | --- |
| GW (gigawatt) | Unit of power. Datacenter scale expressed as power demand | 1GW is one large nuclear reactor, roughly 800k households |
| capex | Capital expenditure. Cost of acquiring assets like datacenters and GPUs | Combined annual AI capex of the big four runs to hundreds of billions of dollars |
| run rate | Recent revenue annualized | Favored in growth-stage PR; overestimation risk |
| depreciation | Spreading asset cost over its useful life | GPUs typically 3-5 years; stretching the assumption inflates profit |
| token | Unit of LLM throughput | One English word is about 1.3 tokens; this post is roughly 10k tokens |
| inference | Compute for serving, as opposed to training | In 2026 the cost center of gravity has shifted from training to inference |

The depreciation assumption in particular is the hidden powder keg of this debate. Whether GPU life is booked at 4 years or 6 changes hyperscaler annual profit by tens of billions of dollars, and several companies have in fact recently lengthened their life assumptions. Skeptics view this as borderline earnings cosmetics; defenders counter that software optimization genuinely extended the effective life of older chips.
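
To see how much leverage the life assumption has, here is a minimal sketch of straight-line depreciation. The 300 billion dollar fleet cost is a hypothetical round number chosen for illustration, not a figure from any company's filings, and real depreciation schedules are more involved than a single division.

```python
def annual_depreciation(asset_cost_usd: float, useful_life_years: float) -> float:
    """Straight-line depreciation: spread the cost evenly over the useful life."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return asset_cost_usd / useful_life_years

# Hypothetical GPU fleet cost (assumption for illustration only)
GPU_FLEET_COST = 300e9

expense_4y = annual_depreciation(GPU_FLEET_COST, 4)
expense_6y = annual_depreciation(GPU_FLEET_COST, 6)
# A lower yearly expense shows up directly as higher reported profit
profit_uplift = expense_4y - expense_6y

print(f"4-year life: {expense_4y / 1e9:.0f}B per year")
print(f"6-year life: {expense_6y / 1e9:.0f}B per year")
print(f"Reported profit difference: {profit_uplift / 1e9:.0f}B per year")
```

Under these assumed numbers, moving from a 4-year to a 6-year life lowers the yearly expense by 25 billion dollars without changing a single GPU in the rack.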

The Slowdown Case — The Arithmetic Does Not Close

First, the slowdown camp — the skeptics. Their case rests on three numbers.

Number 1: 190GW of datacenters

The skeptics' starting point is the tally that major AI datacenter plans announced through 2026 sum to roughly 190GW. For intuition: 1GW is roughly the output of one large nuclear reactor, and average national power demand for a country like South Korea runs around 70-80GW. The announced plans alone would power more than two additional South Koreas, all devoted to AI compute.

Datacenter construction cost is estimated at roughly 35 to 50 billion dollars per GW (site, building, cooling, and the GPUs that dominate the bill). Applying a conservative 40 billion dollars per GW, 190GW implies about 7.6 trillion dollars of cumulative investment. Not all of it will be realized, of course — but even half means around 4 trillion dollars of capital bet on a single technology.
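
The arithmetic above is easy to rerun with your own assumptions. This sketch uses the 190GW tally and the 35 to 50 billion dollar per GW range quoted above; the function name and the `realized_share` parameter are illustrative.

```python
ANNOUNCED_GW = 190  # tally of announced plans cited in the text

def build_cost_usd(gigawatts: float, cost_per_gw_usd: float, realized_share: float = 1.0) -> float:
    """Construction cost for the share of announced capacity that actually gets built."""
    return gigawatts * realized_share * cost_per_gw_usd

for label, cost_per_gw in (("low", 35e9), ("conservative", 40e9), ("high", 50e9)):
    full = build_cost_usd(ANNOUNCED_GW, cost_per_gw)
    half = build_cost_usd(ANNOUNCED_GW, cost_per_gw, realized_share=0.5)
    print(f"{label:>12}: {full / 1e12:.2f}T if all built, {half / 1e12:.2f}T if half built")
```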

Number 2: The 2 trillion dollar annual revenue requirement

Investment must be justified by revenue. The skeptics' calculation structure runs roughly as follows.

Back-solving required revenue (the skeptics' logic)

  Cumulative capex (2026-2030 est.)        ~3 to 4 trillion dollars
  GPU depreciation cycle                    3 to 5 years
  -> Infrastructure cost expensed yearly    ~800 billion dollars or so
  + Power / operations / payroll            hundreds of billions
  + Margin investors demand                 software-industry margins
  ─────────────────────────────────────────
  => AI-related annual revenue needed       ~2 trillion dollars
     around 2030

  Note: estimated total AI industry revenue in 2025 is at most
  a few hundred billion dollars (10-20 percent of the requirement)

For comparison, 2 trillion dollars a year is four times the entire global smartphone market of 2025 (about 500 billion dollars) and more than twice the entire global cloud market (about 900 billion dollars). The math says two new cloud industries must be built within five years.
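
The back-solve can be written as a few lines of arithmetic. The sketch below is a simplification for illustration, not the skeptics' actual model: it depreciates all capex straight-line over the GPU cycle, assumes 400 billion dollars of yearly operating cost (one reading of "hundreds of billions"), and assumes a 40 percent target margin (one reading of "software-industry margins"). With those assumed inputs and the upper 4 trillion dollar capex estimate, a 5-year life lands near the 2 trillion dollar figure.

```python
def required_annual_revenue(
    cumulative_capex_usd: float,
    depreciation_years: float,
    annual_opex_usd: float,
    target_margin: float,
) -> float:
    """Revenue at which the margin left after depreciation and opex equals target_margin."""
    annual_depreciation = cumulative_capex_usd / depreciation_years
    return (annual_depreciation + annual_opex_usd) / (1.0 - target_margin)

# Assumed inputs (illustrative, not sourced figures)
CAPEX = 4e12
OPEX = 400e9
MARGIN = 0.40

for years in (3, 4, 5):
    revenue = required_annual_revenue(CAPEX, years, OPEX, MARGIN)
    print(f"{years}-year life: {revenue / 1e12:.2f} trillion dollars per year")
```

Shortening the assumed life from 5 years to 3 raises the requirement by almost half, which is why the depreciation cycle keeps reappearing in this debate.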

Number 3: The inflection in growth rates

The third pillar of the skeptical case is softening demand-side indicators: analyses showing consumer chatbot traffic growth entering a plateau, surveys finding a large share of enterprise PoCs never convert to production, and the observation that perceived gains between model generations have narrowed, weakening the upgrade motive.

In summary, the slowdown logic goes: supply (infrastructure) is growing geometrically, demand (revenue) is not keeping pace, and the gap must eventually be paid for as someone's loss.

Token Economics — Is the Price Collapse a Blessing or a Curse?

To understand this debate properly, watch the movement of token prices. AI infrastructure revenue ultimately arrives as token sales (API) or token-backed subscriptions.

For several years, the price per token at constant capability has fallen at a frightening pace.

Price per million tokens at constant intelligence (conceptual)

price
(log scale)
  100 |  *
      |    *
   10 |      *
      |        *
    1 |          *
      |            *
  0.1 |              *  <- 2026: some tiers down 100x or more vs 2023
      +---------------------------------->
       2023   2024   2025   2026

Roughly 10x annual price declines, sustained for years

The effect of this collapse on the industry cuts both ways.

| Viewpoint | What falling prices mean |
| --- | --- |
| Demand side (positive) | Use cases that were uneconomical yesterday turn profitable today. If demand is price-elastic, total revenue grows |
| Supply side (negative) | The revenue value of tokens produced by the same GPU shrinks every year. Payback periods on expensive hardware keep lengthening |
| Margin structure | Frontier models keep a premium; one-generation-old models commoditize instantly. Margins are thick only right after a launch and thin quickly |

The cost stack of a single inference

To follow the margin debate, look at the cost composition when a batch of tokens is sold. A conceptual synthesis of public estimates:

Cost decomposition per 100 units of API token revenue (conceptual)

  Revenue 100
   ├─ GPU depreciation       35-50   <- largest item, sensitive to life assumption
   ├─ Power                  10-15
   ├─ Datacenter operations   5-10
   ├─ Network / storage       3-5
   ├─ R&D allocation         10-20   <- allocated cost of training the next model
   └─ Margin                  ?      <- profit or loss depending on assumptions

  Note: how much training cost gets allocated into inference cost
        is the central fork of the profit/loss debate

Optimists say that looking only at marginal cost (producing more tokens on GPUs already bought), inference is already profitable. Skeptics say that once next-model training costs and GPU repurchase are allocated in, the industry as a whole loses money. Both are correct; they merely draw the accounting boundary in different places. This boundary question will largely be settled the moment audited financial statements emerge through IPOs — one more reason the 2026 IPO wave is the watershed of this debate.
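
The two readings can be reproduced from the same stack. This sketch takes the low and high ends of the conceptual ranges above and computes the margin under each accounting boundary. Which items count as "marginal" is an assumption made here for illustration.

```python
# Cost ranges per 100 units of revenue, copied from the conceptual stack above
COST_STACK = {
    "gpu_depreciation": (35, 50),
    "power": (10, 15),
    "datacenter_ops": (5, 10),
    "network_storage": (3, 5),
    "rd_allocation": (10, 20),
}

# Assumption: only these items scale with tokens served on GPUs already bought
MARGINAL_ITEMS = {"power", "datacenter_ops", "network_storage"}

def margin_range(items: set[str], revenue: float = 100.0) -> tuple[float, float]:
    """Return (worst, best) margin when only `items` are counted as costs."""
    low_cost = sum(COST_STACK[name][0] for name in items)
    high_cost = sum(COST_STACK[name][1] for name in items)
    return revenue - high_cost, revenue - low_cost

print("marginal view:    ", margin_range(MARGINAL_ITEMS))
print("fully loaded view:", margin_range(set(COST_STACK)))
```

With these ranges the marginal view stays comfortably positive, while the fully loaded view bottoms out at zero. Any cost beyond the ranges, such as a shorter GPU life, pushes the fully loaded view into loss.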

The central issue is whether the Jevons paradox holds. If usage grows more than 10x when prices fall to a tenth, revenue grows. In 2024-2025 this pattern broadly held. The question is whether it can repeat indefinitely — and exactly here optimism and pessimism part ways.
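
Stated as arithmetic, revenue is price times usage, so the direction of revenue depends on whether usage grows faster than price falls. A minimal sketch, assuming a constant price elasticity of demand (a textbook simplification; real demand curves need not behave this way):

```python
def revenue_factor(price_factor: float, elasticity: float) -> float:
    """Revenue multiplier after a price change, under constant-elasticity demand.

    Usage scales as price_factor ** (-elasticity), so revenue (price * usage)
    scales as price_factor ** (1 - elasticity).
    """
    return price_factor ** (1.0 - elasticity)

PRICE_FACTOR = 0.1  # price falls to a tenth

for elasticity in (0.8, 1.0, 1.2):
    usage = PRICE_FACTOR ** (-elasticity)
    revenue = revenue_factor(PRICE_FACTOR, elasticity)
    print(f"elasticity {elasticity}: usage x{usage:.1f}, revenue x{revenue:.2f}")
```

An elasticity above 1 is the Jevons case: usage grows more than 10x and revenue rises. At exactly 1 revenue is flat, and below 1 every price cut shrinks the top line.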

The Counterarguments — Not a Slowdown, a Transition

Now the opposite camp: the continued-growth case. Their arguments are also grounded in numbers.

Counter 1: A structural explosion in inference demand

Consumer chatbot traffic may be flat, but machine-generated demand is exploding — that is the first counter. The dominant workload of 2026 is not a human typing questions into a chat box but agents autonomously executing long-running tasks.

  • One human question: a few thousand tokens
  • One agent coding task: hundreds of thousands to millions of tokens (planning, tool calls, self-verification loops included)
  • Reasoning model thinking tokens: tens of thousands of extra internal tokens per output

With agents in the Claude Code and Codex class routinely executing multi-hour tasks, per-user token consumption has jumped to tens or hundreds of times the chat-era level. This is why stagnant traffic growth and exploding token demand can both be true at once.
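
A toy model shows how the two observations fit together. Every number here is a hypothetical placeholder chosen to sit inside the ranges above (a few thousand tokens per question, hundreds of thousands per agent task); only the shape of the result matters.

```python
CHAT_TOKENS_PER_QUESTION = 3_000   # "a few thousand tokens"
AGENT_TOKENS_PER_TASK = 500_000    # "hundreds of thousands to millions"
QUESTIONS_PER_DAY = 20             # assumed chat usage per user
TASKS_PER_DAY = 5                  # assumed agent usage per user

def daily_tokens(total_users: int, agent_share: float) -> int:
    """Total tokens per day for a fixed user base with a given share of agent users."""
    agent_users = round(total_users * agent_share)
    chat_users = total_users - agent_users
    chat_tokens = chat_users * QUESTIONS_PER_DAY * CHAT_TOKENS_PER_QUESTION
    agent_tokens = agent_users * TASKS_PER_DAY * AGENT_TOKENS_PER_TASK
    return chat_tokens + agent_tokens

baseline = daily_tokens(1_000, 0.0)
for share in (0.0, 0.1, 0.3):
    tokens = daily_tokens(1_000, share)
    print(f"{share:.0%} agent users: {tokens / 1e6:.0f}M tokens/day ({tokens / baseline:.1f}x)")
```

The user count never changes, yet total token demand grows several-fold as a minority of users shifts to agent workloads.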

Counter 2: Revenue is becoming real

Against the claim that revenue is illusory, the rebuttal is that real numbers are filling in fast. The annualized revenue run rates of major model providers have multiplied severalfold each year since 2024, with coding agents and enterprise APIs at the center of that growth. In developer tooling especially, AI spend has already become a mandatory budget line of its own rather than a substitute for existing spend.

Counter 3: The IPO wave and capital-market scrutiny

The new variable in 2026 is the public market. Reports of an Anthropic S-1 filing, and the fact that The Economist is now analyzing whether equity markets can absorb three mega-listings — Anthropic, SpaceX, OpenAI — signal that the industry is moving past the private-market storytelling phase into the quarterly-earnings verification phase.

Optimists read this as proof of maturity: listing means opening the books, and you do not attempt a listing if the open books look bad. Skeptics read the same fact as an exit strategy — handing over shares at the most expensive possible moment. Two opposite readings of one event coexisting: the signature scenery of 2026.

Infrastructure Lock-in and Depreciation — A Structure Where Time Is the Enemy

The most technically interesting part of this debate is the economics of depreciation.

GPUs are not real estate. A datacenter shell lasts 20-30 years, but the GPUs inside exhaust their economic life in 3-5 years. When the next chip generation delivers multiples of the compute at the same power draw, there comes a point where older chips struggle to justify even their electricity bill.

The lifetime mismatch problem of AI datacenter assets

  Asset                 Share (rough)  Economic life   Risk
  ───────────────────────────────────────────────────────────
  GPUs / accelerators   60-70%        3-5 years       Very high
  Power / cooling       15-20%        10-15 years     Medium
  Building / land       10-15%        20-30 years     Low
  Networking            5-10%         5-7 years       Medium

  => Two-thirds of the investment demands reinvestment within 5 years
  => If revenue does not arrive within 5 years, the arithmetic collapses
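
The mismatch can be turned into a single number: the share of the original investment that has to be re-spent each year just to stand still. The sketch below picks one point inside each rough range above so the shares sum to 100 percent. Those picks are assumptions, not data.

```python
# name -> (share of total investment, economic life in years)
# Assumed points inside the rough ranges listed above
ASSETS = {
    "gpus": (0.65, 4.0),
    "power_cooling": (0.15, 12.5),
    "building_land": (0.12, 25.0),
    "networking": (0.08, 6.0),
}

def share_replaced_within(years: float) -> float:
    """Share of the investment whose economic life ends within `years`."""
    return sum(share for share, life in ASSETS.values() if life <= years)

def annual_replacement_rate() -> float:
    """Steady-state yearly reinvestment as a fraction of the original investment."""
    return sum(share / life for share, life in ASSETS.values())

gpu_share, gpu_life = ASSETS["gpus"]
print(f"Replaced within 5 years: {share_replaced_within(5):.0%}")
print(f"Annual replacement rate: {annual_replacement_rate():.1%}")
print(f"GPU part of that rate:   {(gpu_share / gpu_life) / annual_replacement_rate():.0%}")
```

Under these assumed points, roughly a fifth of the original investment must be re-spent every year, and the GPUs account for most of it.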

Lock-in compounds this. Large AI companies have signed multi-year commitments to specific clouds, specific chip vendors, specific power contracts. If demand undershoots, these contracts become fixed costs squeezing the P&L. If demand overshoots, whoever locked in capacity wins. The long-term contracts signed today are, collectively, an enormous leveraged bet by the whole industry on a demand forecast.

The circular-deal controversy also deserves mention. Chip vendors invest in AI companies; AI companies spend the money on chips; the chip vendor's revenue lifts valuations further. This loop is the weak point skeptics cite most often. If a meaningful share of revenue is capital circulating inside the ecosystem, external demand may be smaller than the books suggest.

Power, the physical constraint

If depreciation is the constraint of time, power is the constraint of physics. A 190GW plan is not realized by capital alone.

  • Grid bottlenecks: building transmission to datacenter sites takes 5-10 years of permitting and construction. Capital is fast; the grid is slow.
  • The nuclear turn: hyperscalers signing reactor-restart deals and investing in SMRs (small modular reactors) is direct evidence of the bottleneck. But commercial SMR operation arrives in the early 2030s at the earliest.
  • Local friction: in datacenter-dense regions, electricity price increases and water usage have become political issues, feeding back into permitting risk.

Paradoxically, the power bottleneck is read differently by each camp. To the slowdown camp, it is proof the planned capacity cannot be realized. To the growth camp, it is an argument that constrained supply preserves the value of existing capacity. The same fact arming both sides — the defining pattern of this debate.

Comparison with the Dotcom Bubble — Similarities and Differences

No bubble debate is complete without the comparison to the 2000 dotcom crash. A fair side-by-side:

| Dimension | Dotcom bubble (1999-2001) | AI investment boom (2023-2026) |
| --- | --- | --- |
| Core infrastructure | Fiber optics, telecom networks | GPU datacenters, power |
| Infrastructure lifetime | Decades (fiber still in use today) | Core asset 3-5 years (GPUs) |
| Revenue substance | Mostly loss-making, many firms with negligible revenue | Leaders hold massive real revenue |
| Funding source | IPO proceeds, heavy retail participation | Big tech operating cash flow plus private capital |
| Crash transmission | Directly to retail via stock market | Indirect, via big tech P&L and private funds |
| Legacy of overinvestment | Cheap networks became the foundation of Web 2.0 | What cheap compute will found remains unknown |
| Reality of the tech | Real, but priced 10 years early | Real, and already in large-scale use |

Two differences matter most.

First, revenue substance. Many dotcom-era firms had no revenue at all, while the leading AI companies of 2026 generate genuinely large revenue. The issue is not whether revenue exists, but whether its growth rate catches up to the investment rate.

Second, infrastructure lifetime. After the dotcom crash, fiber sold for pennies and underwrote twenty years of internet growth. GPUs go obsolete within three to five years, so the legacy that overinvestment bequeaths to the next generation may be far smaller than in 2000. Even if the bubble pops, what remains is different, which weakens the optimists' favorite argument that bubbles leave useful ruins.

Practical Implications for Developers and Companies

There is no need to wait for the macro debate to resolve. Some actions are valid under every scenario.

Strategy 1: Multi-vendor architecture

A system hard-coded to one model vendor is defenseless against price hikes, service changes, and in the worst case vendor disappearance. One layer of abstraction is the basic defense.

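# Minimal skeleton of a model-vendor abstraction layer
from dataclasses import dataclass

class RateLimitError(Exception):
    """Vendor throttled the request (each provider maps its SDK error to this)."""

class ProviderOutageError(Exception):
    """Vendor is unreachable or returning server errors."""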

@dataclass
class LLMResponse:
    text: str
    input_tokens: int
    output_tokens: int
    cost_usd: float

class LLMProvider:
    def complete(self, prompt: str, max_tokens: int) -> LLMResponse:
        raise NotImplementedError

class AnthropicProvider(LLMProvider):
    def complete(self, prompt: str, max_tokens: int) -> LLMResponse:
        ...  # Anthropic API call

class OpenAIProvider(LLMProvider):
    def complete(self, prompt: str, max_tokens: int) -> LLMResponse:
        ...  # OpenAI API call

class FallbackRouter(LLMProvider):
    """Auto-failover to the secondary vendor on outage or rate limits"""
    def __init__(self, primary: LLMProvider, secondary: LLMProvider):
        self.primary, self.secondary = primary, secondary

    def complete(self, prompt: str, max_tokens: int) -> LLMResponse:
        try:
            return self.primary.complete(prompt, max_tokens)
        except (RateLimitError, ProviderOutageError):
            return self.secondary.complete(prompt, max_tokens)

The key is concentrating every direct vendor-SDK import into one place in the codebase. An organization whose switching cost is one day and one whose switching cost is a quarter have fundamentally different price negotiating power.
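
One way to keep that rule from eroding is a small check in CI. The sketch below uses Python's standard `ast` module to report which vendor SDKs a source file imports. The package list and the sample module contents are assumptions for illustration; a real check would walk the repository and allow the imports only in the provider module.

```python
import ast

# Top-level SDK packages that should only appear behind the abstraction layer
VENDOR_PACKAGES = {"anthropic", "openai"}

def vendor_imports(source: str) -> set[str]:
    """Return the vendor SDK packages imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are local code
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in VENDOR_PACKAGES:
                found.add(top_level)
    return found

# Hypothetical file contents
provider_module = "import anthropic\nfrom openai import OpenAI\n"
feature_module = "import json\nfrom myapp.llm import FallbackRouter\n"

print(sorted(vendor_imports(provider_module)))  # vendor SDKs found
print(sorted(vendor_imports(feature_module)))   # nothing to report
```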

Strategy 2: Making cost structure visible, then optimizing it

As AI spend grows, tokens must be managed like cloud cost. In priority order:

  1. Model routing: sending every request to a frontier model is like shipping every parcel by air. Route simple tasks — classification, extraction — to small models. This is the single highest-impact action in practice, typically cutting more than half of the bill.
  2. Prompt caching: caching system prompts and repeated context slashes input cost on those segments. Especially effective for agent workloads that re-reference the same context repeatedly.
  3. Batch processing: jobs with no real-time requirement (overnight analysis, bulk classification) typically cost half through batch APIs.
  4. Output length control: in workloads where most cost is output tokens, merely enforcing a JSON schema on responses reduces the bill.

# Minimal example of model routing
def route_request(task_type: str, complexity_score: float) -> str:
    if task_type in ("classify", "extract", "summarize_short"):
        return "small-fast-model"
    if complexity_score < 0.3:
        return "small-fast-model"
    if task_type == "agentic_coding":
        return "frontier-model"
    return "mid-tier-model"
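
A back-of-envelope check of what routing is worth. The prices, the request size, and the 60 percent share of simple requests below are all hypothetical placeholders; plug in your own traffic mix.

```python
# Hypothetical prices in dollars per million tokens
PRICES = {
    "frontier-model": {"input": 3.00, "output": 15.00},
    "small-fast-model": {"input": 0.10, "output": 0.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million-token prices."""
    price = PRICES[model]
    return input_tokens / 1e6 * price["input"] + output_tokens / 1e6 * price["output"]

def routing_savings(simple_share: float, input_tokens: int = 2_000, output_tokens: int = 500) -> float:
    """Fraction of the bill saved by sending `simple_share` of requests to the small model."""
    frontier = request_cost("frontier-model", input_tokens, output_tokens)
    small = request_cost("small-fast-model", input_tokens, output_tokens)
    routed = simple_share * small + (1.0 - simple_share) * frontier
    return 1.0 - routed / frontier

print(f"Savings with 60% of requests routed down: {routing_savings(0.6):.0%}")
```

Because the small model is so much cheaper per token in this example, the savings track the routed share almost one for one.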

The starting point of cost visibility is measurement. Lay down a minimal token-accounting middleware.

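# Minimal skeleton of token cost accounting middleware
import logging
from collections import defaultdict

logger = logging.getLogger("llm_cost")

PRICE_TABLE = {
    # (model, token kind) -> dollars per million tokens (illustrative values)
    ("frontier-model", "input"): 3.00,
    ("frontier-model", "output"): 15.00,
    ("small-fast-model", "input"): 0.10,
    ("small-fast-model", "output"): 0.40,
}

# Running cost total per product feature, in dollars
usage_by_feature = defaultdict(float)

def track_llm_call(feature: str, model: str, resp) -> None:
    """Price one response and attribute the cost to the feature that made the call."""
    # A model missing from PRICE_TABLE raises KeyError, so every routed model needs an entry
    cost = (
        resp.input_tokens / 1e6 * PRICE_TABLE[(model, "input")]
        + resp.output_tokens / 1e6 * PRICE_TABLE[(model, "output")]
    )
    usage_by_feature[feature] += cost
    # Stand-in for a metrics client: swap in your own emitter and keep the feature/model tags
    logger.info("llm_cost_usd=%.6f feature=%s model=%s", cost, feature, model)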

The feature tag is the crux. An organization that sees only the total invoice spends days finding the cause of a cost spike; an organization with per-feature tagging sees it instantly on a dashboard. The lessons of cloud cost management transfer directly.
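
With per-feature totals in hand, finding the source of a spike is a dictionary diff. A minimal sketch, with hypothetical feature names and weekly dollar totals:

```python
def top_cost_movers(previous: dict[str, float], current: dict[str, float], limit: int = 3):
    """Rank features by cost increase between two periods, largest first."""
    features = set(previous) | set(current)
    deltas = {f: current.get(f, 0.0) - previous.get(f, 0.0) for f in features}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:limit]

# Hypothetical weekly totals in dollars, keyed by feature tag
last_week = {"search_summaries": 410.0, "support_agent": 1250.0, "code_review": 300.0}
this_week = {"search_summaries": 430.0, "support_agent": 3900.0, "code_review": 310.0}

for feature, delta in top_cost_movers(last_week, this_week):
    print(f"{feature:>18}: {delta:+.0f} dollars")
```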

Strategy 3: Scenario planning for price volatility

Anyone responsible for an AI budget should design for all three price scenarios.

  • Continued price collapse: competition keeps pushing prices down. A lock-in-free architecture absorbs the gains immediately.
  • Price normalization (increases): current subsidy-flavored prices rise to reflect capital costs. If the skeptics are right, this is the most likely path. Only organizations with routing and caching already in place absorb the shock.
  • Vendor consolidation: exits or mergers among providers. Multi-vendor abstraction is the insurance policy.

Scenario Analysis — Three Futures

The macro outlook reduces to three scenarios. No probabilities attached — honestly, nobody knows.

| Dimension | Scenario A: Soft landing | Scenario B: Correction | Scenario C: Continued growth |
| --- | --- | --- | --- |
| Path | Investment growth eases gently, demand slowly catches up | Major projects halted, private valuations slump, cascading layoffs | Agent demand keeps consuming infrastructure, shortage persists |
| Token prices | Gradual decline continues | Short-term hikes, then glut-driven crash | Premium model prices hold, budget tier collapses |
| Developer hiring | Moderate growth centered on AI roles | Short-term shock to infra/platform roles | Strong demand across all roles |
| Startup climate | Selective funding | Capital crunch, but bargain compute | Abundant capital continues |
| Historical rhyme | Mid-2000s early cloud | Dotcom 2000, finance 2008 | Late-1990s internet adoption |

Note that Scenario B (correction) has two faces. Just as the years right after the dotcom crash were the best growth environment Google and Amazon ever had, a fire sale of AI infrastructure would be a historic opportunity for whoever scoops up compute at distressed prices. Preparing for the correction scenario is not only defense — it is an offensive plan.

What to Watch — An Indicator Checklist

You cannot know the outcome in advance, but you can track which way it is tipping. A quarterly check of the following is recommended.

AI macro health checklist (quarterly)

[Demand signals]
- Are major providers' annualized revenue growth rates holding quarter over quarter
- Agent/API usage indicators (company disclosures, third-party estimates)
- Direction of PoC-to-production conversion rates in enterprise surveys

[Supply/investment signals]
- Capex guidance raised or cut in big tech quarterly earnings
- Frequency of datacenter project delay/cancellation news
- Hourly price trend on GPU rental marketplaces

[Capital market signals]
- Whether AI IPOs complete, and post-listing price resilience
- Frequency of down rounds in private markets
- Accounting coverage of circular deal structures between chip vendors and AI firms

[Price signals]
- Direction of frontier model API pricing (continued cuts vs reversal to hikes)
- Subsidy-rollback moves: shrinking free tiers, tightening usage caps

The price signals matter most. The moment subsidy-flavored pricing turns to increases is the moment demand's true price elasticity finally gets tested.

Frequently Asked Questions

Q. So is it a bubble or not?

This post's position is that the reality of the technology and the excess of the investment can both be true. Railways were real during the railway bubble; the internet was real during the dotcom bubble. More important than the bubble question: does my position survive a correction?

Q. Should individual developers care about this debate?

You are exposed through two direct channels. First, API pricing changes rewrite the cost structure of side projects and startups. Second, in a correction scenario AI infrastructure hiring wobbles first. Multi-vendor readiness and cost optimization experience are resume lines whose value rises in every scenario.

Q. Is the 2 trillion dollar figure trustworthy?

Treat it as an interval estimate. Stretch depreciation from 5 to 6 years and cut realized capex from 80 to 50 percent, and required revenue falls below 1 trillion dollars; tighten the assumptions and it exceeds 3 trillion. The point is not the precise number but that under any of these assumptions, current revenue of a few hundred billion dollars sits several-fold to an order of magnitude below the requirement.

Q. Is moving into an AI role right now risky?

The sharper question is which roles survive Scenario B. Roles that consume tokens (AI applications, agent operations, cost optimization) have historically seen demand grow during corrections. Roles that supply tokens (greenfield infrastructure, model training) are directly exposed to the capex cycle.

A Critical View — The Weak Points of Both Camps

For balance, the weaknesses of each side's logic.

Weaknesses of the slowdown case:

  1. Measurement lag on demand. The explosion of agentic workloads is a phenomenon of the past year, so many of the statistics showing stagnation may be measuring chat-era indicators.
  2. Assumption-dependence of the 2 trillion dollar figure. The required-revenue estimate halves or doubles depending on depreciation cycle, realized capex ratio, and demanded margin. It looks precise; it is a wide interval.
  3. Content incentives. Bubble-crash prediction is an asymmetric game (forgotten if wrong, heroic if right), which gives commentators an incentive to exaggerate.

Weaknesses of the growth case are no lighter:

  1. Demand quality. A large share of the token explosion comes from agents' trial-and-error loops. As efficiency improves, token consumption per task could plunge, bending the demand curve.
  2. Unverified willingness to pay. Current prices include competitive subsidy; whether demand holds at prices fully reflecting capital cost has never been tested.
  3. Circular-deal dependence. The scale of pure external demand, net of capital circulating within the ecosystem, is hard to confirm from public numbers.

Closing — A Balanced Conclusion

Summing up the debate:

What is certain: AI infrastructure investment is unprecedented in scale; current revenue alone cannot justify it; the bet is therefore a bet on explosive future demand growth. It is also certain that token demand itself is genuinely exploding in the agent era.

What is uncertain: whether that demand growth catches up to the investment before the depreciation clock runs out, and whether demand holds at unsubsidized prices.

My personal reading: the reality of the technology and the excess of the investment can be simultaneously true. Railroads and telecom were real technologies on which overinvestment piled, and real growth came only after the correction. If AI follows the same path, the question is not bubble or not, but when and in what form the correction arrives — and where I will be standing when it does.

For developers there is one comforting conclusion. Under any of the three scenarios, the value of multi-vendor architecture, cost optimization skills, and the ability to verify AI output does not fall. You cannot control the macro. You can control your position.

References