AI Agent OpenRouter 2026.06.06

OpenRouter Weekly Token Rankings: Billing Data Does Not Lie—Who Really Leads?

While MMLU and HumanEval fight over who is smartest on stage, the OpenRouter weekly token rankings track something else: how many trillions of tokens developers and enterprises actually routed to each model in the past seven days. Billing does not lie—money spent and traffic served are closer to real AI adoption than any one-off benchmark.

This article is for developers, tech leads, and procurement leads who must explain “what the market really uses” to their teams. It covers:

  ① why rolling weekly token data is more trustworthy than eval leaderboards;
  ② how the week ending 2026-05-24 hit 28.9 trillion tokens globally and shifted the China–US split;
  ③ the week’s Top 10 model list and DeepSeek’s matrix dominance;
  ④ Anthropic’s premium paradox: token share falling while dollar revenue stays high;
  ⑤ counterintuitive findings from the OpenRouter + a16z report;
  ⑥ a six-step OpenRouter routing checklist and why 24/7 agent hosts belong on bare-metal cloud Macs.

Data source: OpenRouter Rankings (7-day rolling window, through 2026-05-24).

01 Why OpenRouter weekly volume beats MMLU leaderboards

OpenRouter is one of the largest neutral LLM API aggregators: 300+ models from 60+ providers including OpenAI, Anthropic, Google, and DeepSeek, with 8M+ users and roughly 100 trillion tokens processed monthly. Its Rankings page rolls up input plus output token throughput on a 7-day window and updates weekly—the most direct public view of “who is actually being called.”

  • Pain point one: benchmarks test ceilings; billing tests defaults. Lab single-turn scores do not reflect multi-step agents, retries, or tool-call costs. The weekly token board is often led by Flash tiers and open MoE models, not launch-stage Opus flagships.
  • Pain point two: vendor self-reported data is hard to compare. Each provider uses different eval sets and inference tiers. OpenRouter aggregates under one billing and routing layer so cross-model weekly token volume sorts cleanly.
  • Pain point three: monthly totals hide weekly inflection points. New models (Hy3 Preview, Owl Alpha) often spike in week-over-week growth first; quarterly reports miss routing adjustment windows.
  • Pain point four: token share and dollar revenue can diverge. Expensive closed models can lose token share while still owning revenue—procurement that only watches “who is #1” misreads budget structure.

Core thesis: token volume is the thermometer of real AI adoption; the weekly rolling window is the EKG that catches short-term shifts.

For agent capability matrices and June 2026 selection snapshots, see our OpenRouter agent selection guide; this piece focuses on weekly billing data and vendor commercial structure.

02 28.9T weekly tokens: global volume and the China–US split

Reporting period: May 18–24, 2026 (OpenRouter official 7-day rolling window). Global weekly volume reached 28.9 trillion tokens, up 7.4% week over week, the fifth consecutive weekly gain. The same window a year earlier was about 2.4 trillion tokens, so volume has grown roughly 12× year on year as AI workloads scale.

OpenRouter global and regional weekly token overview (2026-05-18 to 2026-05-24)
| Metric | Value | WoW | Read |
|---|---|---|---|
| Global weekly volume | 28.9T tokens | +7.4% | Fifth straight weekly rise; platform pie still expanding |
| China-model weekly volume | 9.223T tokens | +19.89% | Growth well above global average |
| US-model weekly volume | 4.93T tokens | +16.27% | Still large in absolute terms, but weekly volume overtaken by China models |
| China vs US | China #1 for four straight weeks | | China share was <2% in early 2025; first passed US in Feb 2026; ~45%+ by May |
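
The growth figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the numbers quoted in this section, and the helper name `prior_week` is an assumption introduced here, not something OpenRouter publishes.

```python
# Figures quoted in this section (token volumes in trillions).
GLOBAL_T, GLOBAL_WOW = 28.9, 0.074     # +7.4% week over week
YEAR_AGO_T = 2.4                       # "about 2.4 trillion" a year earlier
CHINA_T, CHINA_WOW = 9.223, 0.1989
US_T, US_WOW = 4.93, 0.1627


def prior_week(current: float, wow: float) -> float:
    """Back out last week's volume from this week's volume and its WoW growth."""
    return current / (1.0 + wow)


yoy_multiple = GLOBAL_T / YEAR_AGO_T
china_to_us = CHINA_T / US_T

print(f"Year-on-year multiple: {yoy_multiple:.1f}x")
print(f"Implied prior-week global volume: {prior_week(GLOBAL_T, GLOBAL_WOW):.2f}T")
print(f"Implied prior-week China volume:  {prior_week(CHINA_T, CHINA_WOW):.2f}T")
print(f"Implied prior-week US volume:     {prior_week(US_T, US_WOW):.2f}T")
print(f"China-to-US volume ratio this week: {china_to_us:.2f}")
```

The prior-week values are implied by the published percentages, not separately reported, so treat them as a sanity check and not as data.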

The regional story is not about nationality labels—it is about open source plus ultra-low API pricing reshaping default routes. Developers push agent loops, batch jobs, and coding tasks to DeepSeek Flash, Hy3, MiniMax, and similar tiers, while Western closed flagships stay in high-unit-price, low-token enterprise reasoning.

03 Week of May 24, 2026 Top 10: who captured the most weekly tokens?

The table below ranks models by weekly token volume (input + output). Three DeepSeek models land in the top seven (ranks 1, 4, and 7); the family totals about 5.74 trillion tokens (+25.9% WoW), topping vendor weekly volume for two straight weeks, ahead of Anthropic and Google.

OpenRouter model weekly token Top 10 (through 2026-05-24)
| Rank | Model | Vendor | Weekly tokens | WoW | Notes |
|---|---|---|---|---|---|
| 1 | DeepSeek-V4-Flash | DeepSeek | 3.43T | +66% | Default for agent workflows; ultra-low price |
| 2 | Tencent Hy3 Preview | Tencent | 3.07T | +16% | Still growing after free tier ended |
| 3 | Claude Sonnet 4.6 | Anthropic | 1.35T | | 1M context; enterprise coding workhorse |
| 4 | DeepSeek-V3.2 | DeepSeek | 1.31T | | Low-cost long tail; roleplay active |
| 5 | Owl Alpha | OpenRouter | 1.15T | +29% | Free agent-tuned; 1M context |
| 6 | Gemini 3 Flash Preview | Google | 1.06T | | Multimodal; academic and medical |
| 7 | DeepSeek-V4-Pro | DeepSeek | 1.00T | | Matrix flagship (family total 5.74T) |
| 8 | MiniMax M2.7 | MiniMax | 806B | | Long-context value |
| 9 | Grok 4.1 Fast | xAI | 721B | | 2M context; legal workloads |
| 10 | Step 3.5 Flash | StepFun | 673B | | Fast, low-cost batch processing |
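
To reproduce the family total from the table, one option is to parse the token strings and sum by vendor. This is a minimal sketch over the ten rows listed above; the row data is copied by hand, and `parse_tokens` and `vendor_totals` are hypothetical helper names. Because it only sees Top 10 models, the totals understate any vendor that also has models further down the board.

```python
from collections import defaultdict

# (model, vendor, weekly tokens) copied from the Top 10 table.
TOP10 = [
    ("DeepSeek-V4-Flash", "DeepSeek", "3.43T"),
    ("Tencent Hy3 Preview", "Tencent", "3.07T"),
    ("Claude Sonnet 4.6", "Anthropic", "1.35T"),
    ("DeepSeek-V3.2", "DeepSeek", "1.31T"),
    ("Owl Alpha", "OpenRouter", "1.15T"),
    ("Gemini 3 Flash Preview", "Google", "1.06T"),
    ("DeepSeek-V4-Pro", "DeepSeek", "1.00T"),
    ("MiniMax M2.7", "MiniMax", "806B"),
    ("Grok 4.1 Fast", "xAI", "721B"),
    ("Step 3.5 Flash", "StepFun", "673B"),
]

UNITS = {"T": 1e12, "B": 1e9}


def parse_tokens(text: str) -> float:
    """Convert a string such as '3.43T' or '806B' into a token count."""
    return float(text[:-1]) * UNITS[text[-1]]


def vendor_totals(rows):
    """Sum weekly tokens per vendor across the given rows."""
    totals = defaultdict(float)
    for _model, vendor, tokens in rows:
        totals[vendor] += parse_tokens(tokens)
    return dict(totals)


totals = vendor_totals(TOP10)
for vendor, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{vendor:<12}{total / 1e12:.2f}T")
```

The DeepSeek line sums the three listed models (3.43T + 1.31T + 1.00T), which matches the 5.74T family total quoted above.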

Notable moves that week: Kimi K2.6 had ranked #6 the prior week but fell out of the Top 10—weekly boards are extremely sensitive to hype rotation. DeepSeek-V4-Flash at +66% and Owl Alpha at +29% show default agent routing accelerating toward “ultra-low price + long context + stable tool calls,” not toward the most expensive flagships.

04 Anthropic’s premium paradox: token share down, revenue share near half

Beyond per-model weekly boards, OpenRouter exposes vendor-level token share and dollar revenue share—stack both to see how the 2026 AI market layers.

2026 AI market three-tier structure (token volume vs willingness to pay)
| Tier | Representative models | Token profile | Revenue profile | Typical use |
|---|---|---|---|---|
| High value, low traffic | Claude Opus 4.6 | Tiny token share | Very high unit price; monthly revenue in tens of millions USD | Enterprise complex reasoning, high-risk decisions |
| Mid value, mid traffic | Gemini 3 Flash | Moderate token share | Mid unit price; multimodal premium | Academic, medical, multimodal analysis |
| Ultra-low price, high traffic | DeepSeek / MiniMax / StepFun | Weekly board leaders; fastest growth | Low revenue per token; wins on scale | Agents, coding, batch jobs |

Anthropic’s paradox is stark in weekly data: ~12% token share (down from ~25% a year ago) but ~46% dollar revenue share. Enterprise buyers still pay premium rates for Claude, especially Opus for hard reasoning—yet traffic leadership has shifted to China’s open matrix and free agent models. Claude Opus 4.6 monthly tokens may be a fraction of DeepSeek’s family total, but reported monthly revenue can still land near $25M (public reporting range).
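
One way to put a number on the paradox is to divide revenue share by token share. The sketch below assumes the ~12% and ~46% figures describe the same platform and time window, which the source data does not explicitly confirm; both function names are illustrative.

```python
def relative_revenue_per_token(token_share: float, revenue_share: float) -> float:
    """Vendor revenue per token, as a multiple of the platform-wide average."""
    return revenue_share / token_share


def premium_vs_rest(token_share: float, revenue_share: float) -> float:
    """Vendor revenue per token, as a multiple of all other vendors combined."""
    rest = (1.0 - revenue_share) / (1.0 - token_share)
    return relative_revenue_per_token(token_share, revenue_share) / rest


# Approximate Anthropic shares quoted above.
TOKEN_SHARE, REVENUE_SHARE = 0.12, 0.46

vs_average = relative_revenue_per_token(TOKEN_SHARE, REVENUE_SHARE)
vs_rest = premium_vs_rest(TOKEN_SHARE, REVENUE_SHARE)
print(f"Revenue per token vs platform average: {vs_average:.1f}x")
print(f"Revenue per token vs all other vendors: {vs_rest:.1f}x")
```

Under those assumptions, each Anthropic token earns roughly 3.8 times the platform average and roughly 6.2 times what the remaining vendors earn per token combined.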

For developers: individuals and small teams pick defaults from the weekly board; CFOs read revenue share to see who earns API dollars. You need both tables.

05 Benchmarks inversely correlate with share? a16z report and citeable data

The OpenRouter + a16z 2025 AI Usage Report (built on ~100 trillion tokens of anonymized metadata) highlights a counterintuitive pattern: benchmark scores and real market share are nearly inversely related. The “cheap and steady enough” models absorb the most traffic; eval champions often stay on keynote slides.

  • Reason one: developers optimize inference cost, not peak IQ. An agent pipeline running overnight makes price gaps hurt more than small score gaps.
  • Reason two: agents need stability and low API latency. A single failed tool call plus its retry can cost more than a theoretical +2 MMLU points is worth.
  • Reason three: coding is now the largest single use case. Coding-related traffic rose from ~11% in early 2025 to over 50%—explaining why DeepSeek Flash and Sonnet 4.6 stay on the board.
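
The cost-versus-stability trade-off in the first two reasons can be made concrete with an expected-cost calculation. This is a rough sketch, not a pricing model: it assumes failed steps are retried until they succeed and that attempts are independent, and every price and success rate in it is a made-up placeholder.

```python
def expected_cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend for one successful step when failures are retried.

    With independent attempts, the number of attempts is geometric,
    so the expected count is 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def break_even_success_rate(cheap_cost: float, premium_cost: float,
                            premium_success: float) -> float:
    """Success rate below which the cheap model stops being cheaper per success."""
    return cheap_cost / expected_cost_per_success(premium_cost, premium_success)


# Hypothetical per-attempt costs (USD) and success rates, for illustration only.
cheap = expected_cost_per_success(cost_per_attempt=0.002, success_rate=0.90)
premium = expected_cost_per_success(cost_per_attempt=0.03, success_rate=0.99)
threshold = break_even_success_rate(0.002, 0.03, 0.99)

print(f"Cheap model:   ${cheap:.4f} per successful step")
print(f"Premium model: ${premium:.4f} per successful step")
print(f"Cheap model wins while its success rate stays above {threshold:.1%}")
```

The calculation ignores latency and the downstream cost of a bad tool call, which is exactly where the stability argument applies, so it bounds the price side of the decision only.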

Citeable technical data (public sources at writing; re-check OpenRouter live pages before routing):

  • Global weekly volume: 28.9T tokens (2026-05-18 to 05-24), +7.4% WoW; ~12× vs the same window one year earlier.
  • DeepSeek family weekly total: 5.74T tokens, +25.9% WoW; V4-Flash alone 3.43T, +66% in one week.
  • China vs US weekly volume: China models 9.223T (+19.89%) vs US models 4.93T (+16.27%); China #1 for four consecutive weeks.
  • Anthropic dual metrics: ~12% token share vs ~46% dollar revenue share; token share was ~25% one year ago.
  • Coding task share (OpenRouter + a16z report): from ~11% in early 2025 to 50%+, the platform’s largest single category.

Bottom line: numbers on the bill are more honest than any eval leaderboard. The weekly board is the highest-frequency, lowest-cost signal for tuning OpenRouter routes.

06 Six-step weekly ranking tracker and routing checklist

  1. Check the board every Monday: Open openrouter.ai/rankings, log Top 10 weekly tokens and WoW deltas; smoke-test any new entrant or model with >30% WoW growth for one hour.
  2. Split default vs escalation routes: Point 80% of agent steps at DeepSeek-V4-Flash or Sonnet 4.6; escalate to V4-Pro / Opus only after two failures or on high-risk tasks.
  3. Compare token and dollar tables: When reporting to finance, capture vendor token share and revenue share together—avoid confusing “most used” with “most budget.”
  4. Pick by scenario, not keynote: Agents and batch → Flash tiers; enterprise hard reasoning → Opus; multimodal → Gemini Flash; watch high-growth newcomers like Hy3 and Owl Alpha.
  5. Set spend limits and weekly exports: Configure monthly caps per OpenRouter project key; export usage weekly and cross-check your routes against ranking shifts.
  6. Deploy a 24/7 host: Store API keys, routing config, and launchd units on a dedicated Mac. Closing a laptop lid puts the machine to sleep and kills long-running agents, so use a bare-metal macOS host (see OpenClaw remote Mac troubleshooting).
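
Steps 1 and 2 of the checklist can be sketched in code. The example below is a minimal illustration, not an OpenRouter client: the model names are hypothetical placeholders (real OpenRouter slugs may differ), and `call_model` is an injected function you would supply, so no network call is made here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical route names; check OpenRouter for the actual model identifiers.
DEFAULT_MODEL = "deepseek-v4-flash"
ESCALATION_MODEL = "deepseek-v4-pro"
MAX_DEFAULT_FAILURES = 2   # step 2: escalate after two failures
SMOKE_TEST_WOW = 0.30      # step 1: smoke-test models with >30% WoW growth


def needs_smoke_test(wow: Optional[float], is_new_entrant: bool) -> bool:
    """Step 1: flag new Top 10 entrants and models growing faster than 30% WoW."""
    return is_new_entrant or (wow is not None and wow > SMOKE_TEST_WOW)


@dataclass
class RouteResult:
    model: str      # model that produced the output
    output: str
    attempts: int   # total calls made, including failed ones


def run_step(prompt: str, call_model: Callable[[str, str], str],
             high_risk: bool = False) -> RouteResult:
    """Step 2: use the default route, escalating after two failures or on high risk."""
    attempts = 0
    if not high_risk:
        for _ in range(MAX_DEFAULT_FAILURES):
            attempts += 1
            try:
                return RouteResult(DEFAULT_MODEL, call_model(DEFAULT_MODEL, prompt), attempts)
            except Exception:
                # Any error counts as a failed step in this sketch.
                continue
    attempts += 1
    # Errors from the escalation route are left to propagate to the caller.
    return RouteResult(ESCALATION_MODEL, call_model(ESCALATION_MODEL, prompt), attempts)
```

Passing `call_model` in as a parameter keeps the routing policy separate from the HTTP client, so the same policy can be unit-tested with a stub and later wired to whichever OpenRouter client the team already uses.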

Changing routes from the weekly board alone does not answer who runs the agent. Personal Macs stop when they sleep. Oversubscribed VPS hosts often lack official macOS, so Metal and TCC guarantees fail and SSH jitter breaks multi-step tool loops. Shared spare hardware rarely matches Xcode/CLI versions or key rotation policy.

For teams running Cursor Agent, OpenClaw Gateway, and iOS CI together, JEXCLOUD multi-region bare-metal Macs are a stable production host: dedicated Apple Silicon, real macOS, ~120-second provisioning, monthly elastic terms—with model bills still on OpenRouter and machines cleanly separated from routing. See pricing and help center for specs and onboarding.