Question 1

How is the cost of LLM tokens calculated?

Accepted Answer

LLM APIs typically bill separately for the tokens you send (input/prompt) and the tokens the model generates (output/completion). With prices quoted per million tokens, the formula is (input tokens ÷ 1,000,000 × input price) + (output tokens ÷ 1,000,000 × output price). With the defaults (1M input tokens and 500K output tokens, at $3 input and $15 output per million), that is $3.00 + $7.50 = $10.50.
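As a minimal sketch of that formula, assuming prices are quoted in dollars per million tokens (the function name `token_cost` and its parameter names are illustrative, not part of any provider's API):

```python
def token_cost(input_tokens: int, output_tokens: int,
               input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the total cost in dollars; prices are dollars per million tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# The defaults from the answer above: 1M input, 500K output, $3 / $15 per million.
default_cost = token_cost(1_000_000, 500_000, 3.0, 15.0)
print(f"${default_cost:.2f}")  # prints $10.50
```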

Question 2

Why are output tokens more expensive than input tokens?

Accepted Answer

Output (generation) is the compute-heavy half of inference: the model produces one token at a time, each pass running the full network, while input tokens are processed in parallel during a single prefill. Most providers therefore price output at roughly 3–5× the input rate. With the default prices that is $3 input versus $15 output per million tokens (a 5× ratio), so output-heavy workloads (long answers, summaries, code generation) cost far more per request than input-heavy ones (long context, short reply).
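To make the difference concrete, here is a small sketch comparing two requests with the same total token count but opposite shapes. The request sizes (10,000 and 500 tokens) are hypothetical values chosen for illustration; the prices are the $3 / $15 defaults.

```python
INPUT_PRICE_PER_M = 3.0    # dollars per million input tokens (default from the text)
OUTPUT_PRICE_PER_M = 15.0  # dollars per million output tokens (default from the text)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one request at the default list prices."""
    return (input_tokens / 1_000_000 * INPUT_PRICE_PER_M
            + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M)


# Hypothetical request shapes: both use 10,500 tokens in total.
input_heavy = request_cost(input_tokens=10_000, output_tokens=500)   # long context, short reply
output_heavy = request_cost(input_tokens=500, output_tokens=10_000)  # short prompt, long answer

print(f"input-heavy request:  ${input_heavy:.4f}")
print(f"output-heavy request: ${output_heavy:.4f}")
```

Under these assumed sizes, the output-heavy request costs about four times as much as the input-heavy one, even though both move the same number of tokens.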

Question 3

What is a token, in practical terms?

Accepted Answer

A token is a chunk of text the model reads or writes — on average roughly ¾ of an English word, or about 4 characters. So 1,000,000 tokens is on the order of 750,000 words. If you only know your text in words, multiply by about 1.33 to estimate tokens; in characters, divide by about 4. These are heuristics — the exact count depends on the tokenizer and the language.
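The rules of thumb above can be sketched as two small helpers. The constants are the heuristics from the answer (about ¾ of a word and about 4 characters per token), and the function names are illustrative. For an exact count you would run the provider's own tokenizer instead.

```python
WORDS_PER_TOKEN = 0.75  # heuristic: one token is roughly 3/4 of an English word
CHARS_PER_TOKEN = 4     # heuristic: one token is roughly 4 characters


def tokens_from_words(word_count: int) -> int:
    """Rough token estimate from a word count (equivalent to multiplying by ~1.33)."""
    return round(word_count / WORDS_PER_TOKEN)


def tokens_from_chars(char_count: int) -> int:
    """Rough token estimate from a character count."""
    return round(char_count / CHARS_PER_TOKEN)


print(tokens_from_words(750_000))  # prints 1000000
print(tokens_from_chars(4_000))    # prints 1000
```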

Question 4

What is the blended price per million tokens?

Accepted Answer

The blended price is the single average rate you effectively pay across your particular input/output mix: (input × input price + output × output price) ÷ (input + output). For the default 2:1 input-to-output mix it works out to $7.00 per million. It is the most honest single number for comparing your real workload across providers, because it already weights each side by how much of it you actually use.
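A minimal sketch of the blended-price calculation, again assuming prices in dollars per million tokens (the function name `blended_price_per_m` is illustrative):

```python
def blended_price_per_m(input_tokens: int, output_tokens: int,
                        input_price_per_m: float, output_price_per_m: float) -> float:
    """Token-weighted average price in dollars per million tokens."""
    total_tokens = input_tokens + output_tokens
    if total_tokens == 0:
        raise ValueError("at least one token is needed to compute a blended price")
    weighted = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return weighted / total_tokens


# Default 2:1 mix: 1M input and 500K output at $3 / $15 per million.
blended = blended_price_per_m(1_000_000, 500_000, 3.0, 15.0)
print(f"${blended:.2f} per million tokens")  # prints $7.00 per million tokens
```

Because the result depends only on the ratio of input to output tokens, any 2:1 mix at these prices gives the same $7.00 figure.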

Question 5

Does this include prompt caching or batch discounts?

Accepted Answer

No — this calculator shows the straight, undiscounted list-price cost. If you reuse a large fixed prompt across many calls, prompt caching can cut the input portion sharply; offline/batch processing often gets a flat discount too. Model those with the cached & batch discount calculator.

Question 6

How current are these prices?

Accepted Answer

The bundled defaults are publicly listed prices verified on Jun 25, 2026, each linked to its source. They are convenience defaults only — both price fields are editable, so the calculator stays correct even if a default goes stale. Always confirm current pricing with the provider.

LLM API Token Cost Calculator
