LLM API Token Cost Calculator
Work out what your text will cost on a commercial LLM API. Enter your monthly input and output token volume, pick a model (or type your own per-million prices), and get the total cost plus a clear per-1M breakdown of where the money goes. Numbers update as you type. Prices as of Jun 25, 2026 (see sources); every price is editable.
| Side | Tokens | Price / 1M | Cost |
|---|---|---|---|
| Input | 1M | $3.00 | $3.00 |
| Output | 500K | $15.00 | $7.50 |
How it works
An LLM API meter has two dials, not one. The first counts every token in your prompt — the system message, the conversation history, any retrieved documents, the user's new question. The second counts every token the model writes back. Each dial has its own price, and the two are added together. There is no monthly minimum, no seat fee: you pay strictly for what flows through, which is what makes per-token math worth getting right before you scale.
The reason the breakdown matters is that the two sides rarely cost the same. Because output is priced several times higher than input, a workload's shape drives its bill as much as its size. Two teams pushing the same total token volume can see wildly different invoices: one summarising long documents into short notes (input-heavy, cheap) and one drafting long articles from short briefs (output-heavy, expensive). Reading the per-1M breakdown above tells you which lever to pull — trim the prompt, or rein in generation length.
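To make the point about workload shape concrete, here is a minimal sketch that prices two workloads with the same total volume at the page's default $3 / $15 per million. The 1.4M / 100K token splits and the names `request_cost`, `summarising`, and `drafting` are illustrative assumptions, not figures from any real team.

```python
# Default prices from this page, in USD per 1M tokens.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: each side is billed separately, then summed."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE
    return input_cost + output_cost


# Hypothetical splits: both workloads move 1.5M tokens in total.
summarising = request_cost(input_tokens=1_400_000, output_tokens=100_000)
drafting = request_cost(input_tokens=100_000, output_tokens=1_400_000)

print(f"Summarising (input-heavy): ${summarising:.2f}")
print(f"Drafting (output-heavy):   ${drafting:.2f}")
```

Under these assumed splits the input-heavy workload comes to about $5.70 and the output-heavy one to about $21.30, even though both push the same 1.5M tokens.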
Input cost = input tokens ÷ 1,000,000 × input price
Output cost = output tokens ÷ 1,000,000 × output price
Total = input cost + output cost
Blended price / 1M = (input tokens × input price + output tokens × output price) ÷ (input tokens + output tokens)
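The four formulas above can be sketched as one small function. This is only an illustration of the arithmetic, not the calculator's actual source; the names `CostBreakdown` and `token_cost` are hypothetical, and returning a blended price of 0 for an empty workload is an assumption made here to avoid dividing by zero.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class CostBreakdown:
    input_cost: float           # USD
    output_cost: float          # USD
    total: float                # USD
    blended_per_million: float  # USD per 1M tokens across the whole mix


def token_cost(input_tokens: int, output_tokens: int,
               input_price: float, output_price: float) -> CostBreakdown:
    """Apply the formulas; prices are in USD per 1M tokens."""
    input_cost = input_tokens / TOKENS_PER_MILLION * input_price
    output_cost = output_tokens / TOKENS_PER_MILLION * output_price
    total_tokens = input_tokens + output_tokens
    if total_tokens == 0:
        blended = 0.0  # assumption: report 0 rather than divide by zero
    else:
        blended = (input_tokens * input_price
                   + output_tokens * output_price) / total_tokens
    return CostBreakdown(input_cost, output_cost,
                         input_cost + output_cost, blended)


defaults = token_cost(1_000_000, 500_000, input_price=3.00, output_price=15.00)
print(f"Input ${defaults.input_cost:.2f}, output ${defaults.output_cost:.2f}, "
      f"total ${defaults.total:.2f}, "
      f"blended ${defaults.blended_per_million:.2f} per 1M")
```

With the default inputs this reproduces the figures in the table at the top of the page.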
A worked example
Take the defaults: 1M input tokens and 500K output tokens on Claude Sonnet-class (Anthropic), priced at $3 per million input and $15 per million output.
- Input: 1,000,000 ÷ 1,000,000 × $3 = $3.00
- Output: 500,000 ÷ 1,000,000 × $15 = $7.50
- Total: $3.00 + $7.50 = $10.50
- Blended price: (1,000,000 × $3 + 500,000 × $15) ÷ 1,500,000 = $7.00 per 1M tokens across this 2:1 mix
Notice that the half-million output tokens cost 2.5 times as much as the full million input tokens, the clearest illustration of why generation length dominates an LLM bill. Switch the model in the dropdown to see how a cheaper provider rescales the same workload, or type your own negotiated rates into the price fields. To project this across a whole month of traffic, use the monthly API spend calculator; to compare every provider on one screen, see the provider price comparison.
Frequently asked questions
How is the cost of LLM tokens calculated?
Every API bills you separately for the tokens you send (input/prompt) and the tokens the model generates (output/completion). The formula is simply (input ÷ 1,000,000 × input price) + (output ÷ 1,000,000 × output price). With the defaults — 1M input and 500K output at $3/$15 per million — that is $3.00 + $7.50 = $10.50.
Why are output tokens more expensive than input tokens?
Output (generation) is the compute-heavy half of inference: the model produces one token at a time, each pass running the full network, while input tokens are processed in parallel during a single prefill. Most providers therefore price output at 3–5× the input rate. In the default model that is $3 input versus $15 output per million tokens, a 5× gap, so output-heavy workloads (long answers, long-form drafting, code generation) cost far more per request than input-heavy ones (long context, short reply).
What is a token, in practical terms?
A token is a chunk of text the model reads or writes — on average roughly ¾ of an English word, or about 4 characters. So 1,000,000 tokens is on the order of 750,000 words. If you only know your text in words, multiply by about 1.33 to estimate tokens; in characters, divide by about 4. These are heuristics — the exact count depends on the tokenizer and the language.
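The two rules of thumb above can be sketched as a pair of helpers. The function names are hypothetical, and the word factor uses 4/3 (the inverse of ¾, which the text rounds to 1.33). These are rough planning estimates for English text only; a real tokenizer will give different counts.

```python
def tokens_from_words(words: int) -> int:
    """Rough estimate: about 3/4 of a word per token, so words x 4/3."""
    return round(words * 4 / 3)


def tokens_from_chars(chars: int) -> int:
    """Rough estimate: about 4 characters per token."""
    return round(chars / 4)


print(tokens_from_words(750_000))    # 750,000 words, per the words heuristic
print(tokens_from_chars(4_000_000))  # 4,000,000 characters, per the characters heuristic
```

Both calls land on roughly one million tokens, matching the 750,000-word figure above.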
What is the blended price per million tokens?
The blended price is the single average rate you effectively pay across your particular input/output mix: (input × input price + output × output price) ÷ (input + output). For the default 2:1 input-to-output mix it works out to $7.00 per million. It is the most honest single number for comparing your real workload across providers, because it already weights each side by how much of it you actually use.
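As a sketch of why the blended price is useful for comparisons, the snippet below evaluates two price lists against two mixes. The $5 / $12 price list labelled "hypothetical B" is invented purely for illustration and does not correspond to any real provider; `blended_price_per_million` is likewise a hypothetical helper name.

```python
def blended_price_per_million(input_tokens: int, output_tokens: int,
                              input_price: float, output_price: float) -> float:
    """Average USD per 1M tokens, weighted by the actual input/output mix."""
    total_tokens = input_tokens + output_tokens
    if total_tokens == 0:
        raise ValueError("need at least one token to compute a blended price")
    return (input_tokens * input_price
            + output_tokens * output_price) / total_tokens


# (input price, output price) in USD per 1M tokens.
PRICES = {
    "default ($3 / $15)": (3.00, 15.00),
    "hypothetical B ($5 / $12)": (5.00, 12.00),  # invented for illustration
}

# (input tokens, output tokens)
MIXES = {
    "2:1 input-heavy": (1_000_000, 500_000),
    "1:2 output-heavy": (500_000, 1_000_000),
}

results = {}
for mix_name, (inp, out) in MIXES.items():
    for price_name, (in_price, out_price) in PRICES.items():
        blended = blended_price_per_million(inp, out, in_price, out_price)
        results[(mix_name, price_name)] = blended
        print(f"{mix_name:<18} {price_name:<28} ${blended:.2f} per 1M")
```

In this made-up comparison the default price list is cheaper for the 2:1 mix, while the hypothetical one is cheaper for the 1:2 mix, which is why the blended figure has to be computed for your own mix rather than read off a price sheet.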
Does this include prompt caching or batch discounts?
No — this calculator shows the straight, undiscounted list-price cost. If you reuse a large fixed prompt across many calls, prompt caching can cut the input portion sharply; offline/batch processing often gets a flat discount too. Model those with the cached & batch discount calculator.
How current are these prices?
The bundled defaults are publicly listed prices verified on Jun 25, 2026, each linked to its source. They are convenience defaults only — both price fields are editable, so the calculator stays correct even if a default goes stale. Always confirm current pricing with the provider.
Disclaimer. LLMTCO provides cost estimates and planning tools for informational purposes only. AI API and GPU prices change frequently; bundled defaults reflect publicly listed prices as of the verification date shown (Jun 25, 2026) and may be out of date — always confirm current pricing with the provider. These figures are estimates, not financial, tax, or procurement advice, and do not capture every real-world factor (latency, reliability, compliance, data privacy, engineering time).