Question 1

Which LLM API is the cheapest?

Accepted Answer

It depends entirely on your input/output mix. For the default volume here (10M input + 2M output tokens/month), the lowest-cost option is Llama-70B via managed API (e.g. Groq/Together) at $7.60/month, versus $60.00 for the most expensive, Claude Sonnet-class (Anthropic). That is a $52.40 (87%) gap for the identical workload. Change the volume and the ranking can shift, because models price input and output differently.
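As a rough sketch of how these monthly figures come about, the snippet below multiplies each side of the volume by its per-million price and sorts the results. The prices are copied from the table's defaults; the names `PRICES`, `monthly_cost`, and `rank` are illustrative and not part of the tool.

```python
# USD per 1M tokens (input, output), copied from the comparison table.
PRICES = {
    "Llama-70B via managed API": (0.60, 0.80),
    "Gemini Flash-class (Google)": (0.30, 2.50),
    "GPT-4o-class (OpenAI)": (2.50, 10.00),
    "Claude Sonnet-class (Anthropic)": (3.00, 15.00),
}


def monthly_cost(input_m, output_m, price_in, price_out):
    """Monthly cost in USD; volumes are in millions of tokens."""
    return input_m * price_in + output_m * price_out


def rank(input_m, output_m):
    """Return (name, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: monthly_cost(input_m, output_m, p_in, p_out)
        for name, (p_in, p_out) in PRICES.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Default volume: 10M input + 2M output tokens per month.
ranking = rank(10, 2)
cheapest_name, cheapest_cost = ranking[0]
priciest_name, priciest_cost = ranking[-1]
gap = priciest_cost - cheapest_cost

for name, cost in ranking:
    print(f"{name}: ${cost:.2f}")
print(f"gap: ${gap:.2f} ({gap / priciest_cost:.0%} of the most expensive)")
```

Calling `rank` with a different pair of volumes re-sorts the same price list for that workload.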

Question 2

Why does the cheapest provider change with my token mix?

Accepted Answer

Each model has its own ratio of input to output price. A model with cheap input but pricey output wins on long-context, short-answer work and loses on long-generation work. That is why this tool ranks on your blended cost rather than on a single headline number — the blended price per 1M already weights each side by how much of it you actually use.
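To make the flip concrete, here is a minimal sketch that compares the two cheapest rows of the table and solves for the output share at which they cost the same. The prices are the table's defaults; the function names and the second test volume (10M input + 1M output) are assumptions chosen for illustration.

```python
# USD per 1M tokens (input, output), copied from the table's default prices.
LLAMA_70B = (0.60, 0.80)
GEMINI_FLASH = (0.30, 2.50)


def cost(prices, input_m, output_m):
    """Monthly cost in USD for volumes given in millions of tokens."""
    price_in, price_out = prices
    return input_m * price_in + output_m * price_out


def break_even_output_share(a, b):
    """Output share f of total tokens at which models a and b cost the same.

    Solves in_a*(1-f) + out_a*f == in_b*(1-f) + out_b*f for f.
    A result outside [0, 1] means one model is cheaper for every mix.
    """
    (in_a, out_a), (in_b, out_b) = a, b
    denom = (in_a - in_b) + (out_b - out_a)
    if denom == 0:
        raise ValueError("no single break-even point for these prices")
    return (in_a - in_b) / denom


share = break_even_output_share(LLAMA_70B, GEMINI_FLASH)
print(f"break-even output share: {share:.0%}")

# The default mix, then a hypothetical mix with half the output volume.
for input_m, output_m in [(10, 2), (10, 1)]:
    llama = cost(LLAMA_70B, input_m, output_m)
    gemini = cost(GEMINI_FLASH, input_m, output_m)
    winner = "Llama-70B" if llama < gemini else "Gemini Flash-class"
    print(f"{input_m}M in + {output_m}M out: "
          f"Llama ${llama:.2f}, Gemini ${gemini:.2f} -> {winner}")
```

With these prices the two models break even when output is about 15% of all tokens. The default mix sits just above that (2 of 12 million, roughly 17%), so Llama-70B is cheaper there, while the more input-heavy mix favors the Gemini Flash-class row.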

Question 3

Are these the same models, just at different prices?

Accepted Answer

No — they are different model classes with different capabilities. A frontier model and a fast/cheap model are not interchangeable for every task. Treat this as a cost map, not a quality ranking: use it to find the cheapest model that is good enough for each slice of your workload, and route accordingly.

Question 4

What is the blended price per million?

Accepted Answer

It is the single average rate you pay across your specific mix: (input tokens × input price + output tokens × output price) ÷ (input tokens + output tokens). It collapses the two-dial pricing into one comparable number, so the Blended /1M column lets you line every provider up on equal terms for your workload.
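The formula can be written as a small function. This is a sketch under the assumption that volumes are expressed in millions of tokens and prices in USD per million; `blended_price_per_million` is an illustrative name, not something the tool exposes.

```python
def blended_price_per_million(price_in, price_out, input_m, output_m):
    """Average USD paid per 1M tokens for a given input/output mix.

    price_in, price_out: USD per 1M tokens.
    input_m, output_m: monthly volume in millions of tokens.
    """
    total_m = input_m + output_m
    if total_m <= 0:
        raise ValueError("total volume must be positive")
    return (input_m * price_in + output_m * price_out) / total_m


# Llama-70B row at the default volume (10M input + 2M output).
rate = blended_price_per_million(0.60, 0.80, 10, 2)
print(f"blended: ${rate:.2f} per 1M tokens")

# Multiplying the blended rate by total volume recovers the monthly cost.
print(f"monthly: ${rate * (10 + 2):.2f}")
```

Because the blended rate depends on the mix, it has to be recomputed whenever the input or output volume changes.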

Question 5

Should I just pick the cheapest row?

Accepted Answer

Only after weighing quality, latency, rate limits, region availability, and data-handling terms. The cheapest model that meets your quality bar is the right answer — not the cheapest model overall. Many teams run a mix: a cheap model for the bulk of traffic and a frontier model for the hard requests.
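For a sense of what a mixed setup might cost, the sketch below weights two monthly totals from the table by the share of traffic each model receives. The 10% frontier share is a hypothetical figure, and the calculation assumes both slices of traffic have the same input/output mix as the default volume, which will not hold for every workload.

```python
def routed_monthly_cost(cheap_cost, frontier_cost, frontier_share):
    """Monthly USD cost when a fraction of traffic goes to the frontier model.

    cheap_cost, frontier_cost: cost of sending ALL traffic to each model.
    frontier_share: fraction of traffic (0.0 to 1.0) routed to the frontier model.
    Assumes every request has the same input/output mix.
    """
    if not 0.0 <= frontier_share <= 1.0:
        raise ValueError("frontier_share must be between 0 and 1")
    return (1.0 - frontier_share) * cheap_cost + frontier_share * frontier_cost


# Monthly totals from the table at the default volume.
LLAMA_70B_MONTHLY = 7.60
CLAUDE_SONNET_MONTHLY = 60.00

# Hypothetical split: 90% of traffic to the cheap model, 10% to the frontier model.
mixed = routed_monthly_cost(LLAMA_70B_MONTHLY, CLAUDE_SONNET_MONTHLY, 0.10)
print(f"mixed routing: ${mixed:.2f}/month")
```

Under these assumptions the mixed setup lands much closer to the cheap model's bill than to the frontier model's, which is the economic case for routing.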

Question 6

How current are these prices?

Accepted Answer

The bundled defaults are publicly listed prices, each verified on the date shown in its row and linked to its source below. They are convenience defaults; the volume is editable so you can re-rank providers for your own workload. Always confirm current pricing with the provider.

Model / provider	$/1M in	$/1M out	Blended /1M	Monthly cost	Verified
Llama-70B via managed API (e.g. Groq/Together) [cheapest]	$0.60	$0.80	$0.63	$7.60	Jun 25, 2026
Gemini Flash-class (Google)	$0.30	$2.50	$0.67	$8.00	Jun 25, 2026
GPT-4o-class (OpenAI)	$2.50	$10.00	$3.75	$45.00	Jun 25, 2026
Claude Sonnet-class (Anthropic)	$3.00	$15.00	$5.00	$60.00	Jun 25, 2026

LLM Provider Price Comparison
