AI inference priced live.
What every major model charges, ranked at your token mix. Every number on this page comes from models.dev via the Suan Pricing Index. If models.dev doesn’t have a model, it isn’t shown.
What You Get
More than a simple price lookup.
Live rates, not last quarter's
Every price comes from models.dev via the Suan Pricing Index. If models.dev doesn't index a model, we don't display it. No static fallbacks. No made-up numbers.
Ranked at your token mix
Headline rates are misleading when output dominates your bill. Sort by per-request cost at your real input/output ratio to see which model is actually cheapest for you.
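As a rough illustration of ranking at a token mix, here is a minimal Python sketch. It assumes prices are quoted in USD per million tokens; the model names and rates are made up for the example and are not taken from models.dev or the Suan Pricing Index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    model: str           # hypothetical model name, for illustration only
    input_per_m: float   # assumed unit: USD per 1M input tokens
    output_per_m: float  # assumed unit: USD per 1M output tokens


def request_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given token mix."""
    return (input_tokens * rate.input_per_m
            + output_tokens * rate.output_per_m) / 1_000_000


def rank_by_mix(rates, input_tokens: int, output_tokens: int):
    """Return rates sorted from cheapest to most expensive per request."""
    return sorted(rates, key=lambda r: request_cost(r, input_tokens, output_tokens))


# Illustrative numbers only: model-a has the lower headline input rate.
rates = [
    Rate("model-a", input_per_m=0.50, output_per_m=4.00),
    Rate("model-b", input_per_m=1.00, output_per_m=2.00),
]

input_heavy = [r.model for r in rank_by_mix(rates, 10_000, 500)]
output_heavy = [r.model for r in rank_by_mix(rates, 500, 10_000)]
```

With these made-up rates, model-a ranks first for the input-heavy request and model-b ranks first for the output-heavy one, so the cheapest headline input rate does not decide the ranking on its own.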
Capabilities, sourced
Reasoning support, context window, and provider key are all read from models.dev. The Source column shows exactly which provider key each price came from.
Shareable
Copy a link to share your exact configuration. Present the numbers to your team or stakeholders.
How It Works
- 01
Read the live price grid
Every model that models.dev indexes, sortable by cheapest input, cheapest output, largest context, or cheapest for your token mix. Click any row to set it as primary.
- 02
Plug in your workload
Pick a scenario for typical token volumes, or type your own. Toggle up to 4 comparison models from the grid to see them side-by-side.
- 03
See where cost lives
Input vs output split, top-driver breakdown, and the cheapest alternative in your comparison set, all derived from your actual config and live prices.
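The breakdown in the last step can be sketched as follows. This is a hedged illustration, not the tool's actual implementation: the function names, the per-million-token pricing unit, and every price below are assumptions chosen for the example.

```python
def cost_split(prices, input_tokens, output_tokens):
    """Split one request's cost into input and output parts.

    `prices` is an (input, output) pair, assumed to be USD per 1M tokens.
    """
    input_cost = input_tokens * prices[0] / 1_000_000
    output_cost = output_tokens * prices[1] / 1_000_000
    total = input_cost + output_cost
    return {
        "input": input_cost,
        "output": output_cost,
        "total": total,
        # Share of the bill that comes from output tokens (0.0 if cost is zero).
        "output_share": output_cost / total if total else 0.0,
        # The larger of the two parts is the top cost driver.
        "top_driver": "output" if output_cost > input_cost else "input",
    }


def cheapest_alternative(primary, comparison, input_tokens, output_tokens):
    """Return (name, saving) for the cheapest comparison model, or None
    if nothing in the comparison set beats the primary model."""
    primary_total = cost_split(primary, input_tokens, output_tokens)["total"]
    best = None
    for name, prices in comparison.items():
        total = cost_split(prices, input_tokens, output_tokens)["total"]
        if total < primary_total and (best is None or total < best[1]):
            best = (name, total)
    return None if best is None else (best[0], primary_total - best[1])


# Made-up prices for illustration only.
primary = (3.00, 15.00)
comparison = {"model-b": (0.50, 1.50), "model-c": (1.00, 5.00)}

split = cost_split(primary, input_tokens=2_000, output_tokens=1_000)
alternative = cheapest_alternative(primary, comparison, 2_000, 1_000)
```

In this example output tokens are the top driver even though the request sends twice as many input tokens, and model-b is the cheapest alternative in the comparison set.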
Go Deeper
The thinking behind the numbers.
The Inference Tax
Why your GenAI budget is hiding 80% of its real cost. The infrastructure iceberg beneath every API call.
Read article
Blog
The Hidden Tax of AI
What CFOs aren’t seeing in their AI investments. The costs that compound before anyone notices.
Read article
Free Tool
Carbon Footprint
Tokens at scale have a carbon side too. Region-aware emissions for the same workload, SCI-aligned methodology.
Open the calculator
Spending more than you should?
Let's find where your cloud and AI spend can work harder.