Token Demand Index

Name: Token Demand Index (AI inference consumption gauge)
Creator: Bargo
License: https://bargo.ai/terms

Tracked inference demand hit 44.7 trillion tokens per week, pushing the Token Demand Index to 189.9: up 90% since 2026-05-06 and up 74% over the past 30 days. The narrative of collapsing token prices obscures how quickly the market is expanding: open-source models now handle 45.1% of volume, at a demand-weighted effective price of just $2.18 per 1 million tokens. The index is the output-side companion to the Compute Tightness Index (the GPU input).

Demand index (100 = 2026-05-06): 189.9
Tokens / week: 44.7T
Demand, 30 days: ▲ +74%
Open-source share: 45.1%
Effective $/1M: $2.18
Implied spend/week*: $97.4M

As of 2026-06-28, tracked AI inference runs at 44.7T tokens/week, up 74% in 30 days and up 90% since 2026-05-06. Open-source models carry 45.1% of those tokens, holding the demand-weighted price near $2.18/1M, a fraction of frontier list prices, even as frontier models themselves got pricier.


[Chart: Token demand vs the effective price of inference. Series: Total tokens/day; Effective $/1M (right axis).]

[Chart: Who consumes the tokens. Series: Open-source, Other, Claude, OpenAI, Google, xAI.]

By provider group — latest snapshot

Group	Blended $/1M	Tokens/week	Share	30-day demand change
Open-source	$0.40	20.2T	45.1%	▲ +110%
Other	$0.93	10.8T	24.3%	▲ +202%
Claude	$10.00	6.5T	14.5%	▲ +18%
OpenAI	$3.47	3.2T	7.1%	▲ +17%
Google	$0.85	3.8T	8.6%	▼ -8%
xAI	$1.56	129.4B	0.3%	▲ +16%

Open-source = aggregated weights-available labs (DeepSeek, Qwen, Llama, Mistral, etc.). "Other" = unclassified OpenRouter providers. Share = % of tracked tokens. *Implied spend = list price × tokens, not realized revenue.
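The headline figures can be roughly reproduced from the table rows. The sketch below is illustrative rather than the index's actual pipeline: it assumes the effective price is a token-weighted average of each group's blended price, and that implied spend is price times tokens summed over groups. The function and variable names are made up for this example. Because the table values are rounded, the results land near the published $2.18/1M and $97.4M/week rather than matching them exactly.

```python
# Rows copied from the provider-group table (rounded, as published).
# Each entry: (group, blended $ per 1M tokens, tokens per week).
GROUPS = [
    ("Open-source", 0.40, 20.2e12),
    ("Other", 0.93, 10.8e12),
    ("Claude", 10.00, 6.5e12),
    ("OpenAI", 3.47, 3.2e12),
    ("Google", 0.85, 3.8e12),
    ("xAI", 1.56, 129.4e9),
]


def implied_spend_usd(groups):
    """List price x tokens, summed over groups (not realized revenue)."""
    return sum(price * tokens / 1e6 for _, price, tokens in groups)


def effective_price_per_million(groups):
    """Token-weighted average of the group prices, in $ per 1M tokens."""
    total_tokens = sum(tokens for _, _, tokens in groups)
    return implied_spend_usd(groups) / total_tokens * 1e6


def token_shares(groups):
    """Each group's fraction of all tracked tokens."""
    total_tokens = sum(tokens for _, _, tokens in groups)
    return {name: tokens / total_tokens for name, _, tokens in groups}


if __name__ == "__main__":
    print(f"Effective $/1M: {effective_price_per_million(GROUPS):.2f}")
    print(f"Implied spend/week: ${implied_spend_usd(GROUPS) / 1e6:.1f}M")
    for name, share in token_shares(GROUPS).items():
        print(f"{name}: {share:.1%}")
```

The weighting explains why the effective price sits so far below frontier list prices: Claude's $10.00 rate applies to under 15% of tokens, while the $0.40 open-source rate applies to roughly 45%.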

Methodology

Built from OpenRouter's model rankings and pricing, with models grouped into Claude / OpenAI / Google / xAI / Open-source / Other. Token counts are OpenRouter's trailing-7-day figures (the rankings "week" view), so each point is the tokens processed over the prior week, not a single day. Per group: blended median $/1M (75/25 input/output) and weekly total_tokens. Demand index = latest weekly total ÷ the first week's total × 100. Effective $/1M = demand-weighted blended price (as-of prices are carried forward across snapshots where only one side updated). The series starts on 2026-05-06 and grows as new snapshots land. The full history is available as CSV. Coverage is OpenRouter traffic only: a large but partial slice of the market.
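A minimal sketch of the two formulas described above, under stated assumptions. The methodology does not say whether the 75/25 blend is applied per model before taking the median or to the group's median input and output prices; this example assumes the former. The model prices are hypothetical, and the base-week total of 23.54T is back-computed from the published 189.9 index and 44.7T weekly figure, not a published value.

```python
from statistics import median

# 75/25 input/output weighting, as stated in the methodology.
INPUT_WEIGHT = 0.75
OUTPUT_WEIGHT = 0.25


def blended_price(input_price, output_price):
    """Blend input and output $ per 1M tokens into a single price."""
    return INPUT_WEIGHT * input_price + OUTPUT_WEIGHT * output_price


def group_blended_median(models):
    """Median of per-model blended prices (assumed order of operations)."""
    return median(blended_price(inp, out) for inp, out in models)


def demand_index(weekly_totals):
    """Latest weekly total divided by the first week's total, times 100."""
    if not weekly_totals or weekly_totals[0] <= 0:
        raise ValueError("need a positive base-week total")
    return weekly_totals[-1] / weekly_totals[0] * 100


if __name__ == "__main__":
    # Hypothetical (input, output) $ per 1M prices for three models in a group.
    example_models = [(1.0, 5.0), (0.2, 0.6), (3.0, 15.0)]
    print(group_blended_median(example_models))

    # Base week back-computed from the published index; latest week as published.
    print(round(demand_index([23.54e12, 44.7e12]), 1))
```

Because the index divides by the first week's total, any revision to that base snapshot would rescale every later reading.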

Pair it with the Compute Tightness Index (GPU input) for both sides of the AI-compute economy. Query the raw data from an agent via the Bargo MCP endpoint (get_inference_economics).