LLM Cost Per Conversation Calculator
Find the true cost of a chatbot or agent conversation, where the full history is resent on every turn. Enter the system, user, and assistant token sizes and the number of turns, and get the cost of one conversation plus a daily and monthly projection for your traffic. Billed input grows roughly quadratically with the number of turns, and the calculator shows exactly how much. Numbers update as you type. Prices are as of Jun 25, 2026 (sources are linked below); every field is editable.
| Turn | Input billed | Output billed | Cumulative input |
|---|---|---|---|
| 1 | 300 | 300 | 300 |
| 2 | 700 | 300 | 1,000 |
| 3 | 1,100 | 300 | 2,100 |
| 4 | 1,500 | 300 | 3,600 |
| 5 | 1,900 | 300 | 5,500 |
Each turn resends the system prompt plus all prior turns, so input billed per turn climbs while output stays flat. This table re-renders as you change the token sizes and turn count above.
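The rows of the table can be reproduced with a few lines of Python. This is a minimal sketch, not the calculator's own code: the function name `turn_rows` and its parameter names are assumptions made for illustration, and it assumes every user message and every reply has the same size.

```python
def turn_rows(system_tokens, user_tokens, assistant_tokens, turns):
    """Return one (turn, input_billed, output_billed, cumulative_input) tuple per turn."""
    rows = []
    cumulative_input = 0
    for turn in range(1, turns + 1):
        # History = every earlier user message and assistant reply.
        history = (turn - 1) * (user_tokens + assistant_tokens)
        # Each call resends the system prompt and history, plus the new user message.
        input_billed = system_tokens + history + user_tokens
        cumulative_input += input_billed
        rows.append((turn, input_billed, assistant_tokens, cumulative_input))
    return rows


if __name__ == "__main__":
    # Defaults from this page: 200 system, 100 user, 300 assistant, 5 turns.
    for row in turn_rows(200, 100, 300, 5):
        print("| {} | {:,} | {:,} | {:,} |".format(*row))
```

Changing any argument regenerates the rows in the same way the interactive table does.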
How it works
A language model has no memory of its own. Between one API call and the next it forgets everything, so the only way to hold a coherent conversation is to replay the whole transcript on every single turn. The system prompt, all earlier user questions, all earlier assistant answers, and the new message — every token of it is sent again, and every token is billed again. What looks like a five-message chat is really five calls of ever-growing size stacked on top of each other.
This is why conversation cost behaves so differently from one-shot cost. The first turn is cheap; the last turn pays to re-read the entire preceding conversation. Because each turn's input is roughly proportional to how far into the conversation you are, the cumulative input cost grows with the square of the turn count, not linearly. Double the conversation length and you roughly quadruple the input bill. Only the output side stays linear — one fresh reply per turn. The table above lays this out turn by turn so the quadratic creep is impossible to miss: watch the "input billed" column climb while "output billed" holds steady.
Turn t input = system + (all prior user + assistant tokens) + new user message
Total input = Σ over all turns of (system + all prior user + assistant tokens + new user message)
Total output = turns × assistant tokens
Conversation cost = (total input ÷ 1M × input price) + (total output ÷ 1M × output price)
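When every user message and every reply has the same size, the sum collapses to a closed form that makes the quadratic term explicit. The symbols below are shorthand introduced here, not part of the calculator: s, u, and a are the system, user, and assistant token counts, and T is the number of turns.

```latex
\begin{aligned}
\text{input}_t &= s + (t-1)(u+a) + u \\
\text{Total input} &= \sum_{t=1}^{T} \bigl[\, s + u + (t-1)(u+a) \,\bigr]
  = T(s+u) + (u+a)\,\frac{T(T-1)}{2} \\
\text{Total output} &= T\,a
\end{aligned}
```

With s = 200, u = 100, a = 300, and T = 5 this gives 5 × 300 + 400 × 10 = 5,500 input tokens. The T(T−1)/2 factor is the source of the quadratic growth; the system prompt contributes only to the linear term.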
A worked example
Using the defaults (a 200-token system prompt, 100-token user messages, and 300-token replies over 5 turns) on a Claude Sonnet-class model (Anthropic) priced at $3 input / $15 output per 1M tokens:
- Turn 1 input is just 300 tokens; by turn 5 it has grown to 1,900 tokens — the history piling up.
- Total billed input across all 5 turns: 5,500 tokens
- Total billed output: 5 × 300 = 1,500 tokens
- Conversation cost: (5,500 ÷ 1M × $3) + (1,500 ÷ 1M × $15) = $0.0165 + $0.0225 = $0.039
- At 1,000 conversations/day: ≈ $39.00/day, ≈ $1,170/month
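Notice that the five turns together resend the early messages many times over. At the default 5 turns, output is still the larger share of the bill ($0.0225 of the $0.039, against $0.0165 for input), but input is the part that accelerates. Raise the turn count to 10 and billed input reaches 21,000 tokens: the per-conversation cost rises to about $0.108, roughly 2.8 times the 5-turn figure, and input ($0.063) now exceeds output ($0.045). That more-than-doubling is the signature of quadratic growth. The fix is to cap or summarise history. To soften the fixed part, cache the system prompt and model the saving with the cached & batch discount calculator; to price a single isolated turn, use the token cost calculator.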
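The worked example, and the comparison with a 10-turn conversation, can be checked with a short script. This is a sketch under the same assumptions as the formulas above (fixed message sizes, full history resent each turn); `conversation_cost` and `project` are illustrative names, and the 30-day month is the approximation used on this page.

```python
def conversation_cost(system_tokens, user_tokens, assistant_tokens, turns,
                      input_price_per_m, output_price_per_m):
    """Return (total_input_tokens, total_output_tokens, cost_in_dollars)."""
    total_input = 0
    history = 0  # user + assistant tokens accumulated from earlier turns
    for _ in range(turns):
        total_input += system_tokens + history + user_tokens
        history += user_tokens + assistant_tokens
    total_output = turns * assistant_tokens
    cost = (total_input / 1_000_000 * input_price_per_m
            + total_output / 1_000_000 * output_price_per_m)
    return total_input, total_output, cost


def project(cost_per_conversation, conversations_per_day, days_per_month=30):
    """Return (daily_cost, monthly_cost) for a given traffic level."""
    daily = cost_per_conversation * conversations_per_day
    return daily, daily * days_per_month


if __name__ == "__main__":
    for turns in (5, 10):
        tokens_in, tokens_out, cost = conversation_cost(200, 100, 300, turns, 3.0, 15.0)
        daily, monthly = project(cost, 1_000)
        print(f"{turns} turns: {tokens_in:,} in, {tokens_out:,} out, "
              f"${cost:.3f}/conversation, ${daily:,.2f}/day, ${monthly:,.2f}/month")
```

With the defaults this yields 5,500 input tokens and $0.039 for 5 turns, and 21,000 input tokens and $0.108 for 10 turns.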
Frequently asked questions
Why does a multi-turn conversation cost more than the sum of its messages?
Because LLMs are stateless: the model remembers nothing between calls, so to keep context you must resend the entire conversation history on every turn. Turn 5 pays to process turns 1–4 all over again, plus the new message. The input you are billed for grows with each turn, so total cost rises faster than linearly — roughly quadratically with conversation length. With the defaults, one 5-turn conversation costs $0.039.
What exactly grows each turn?
Only the input side grows. Each turn you resend the system prompt plus every prior user and assistant message, then add the new user message — so billed input tokens climb turn after turn. Output stays roughly constant per turn (one fresh answer). That is why long conversations are dominated by re-processing old context, not by generating new replies.
How can I reduce conversation cost?
Three levers: cap the history (keep only the last N turns or a running summary instead of the full transcript), cache the system prompt so the fixed part of the context is billed at a fraction of its price, and shorten answers where you can, since every reply is billed once as output and then again as input on each later turn. Summarising old turns into a compact memory is the single biggest win for long sessions because it bounds the per-turn input, turning the quadratic growth into roughly linear growth.
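The effect of capping history can be sketched by limiting how many prior turns are resent. This is an illustration only: `max_history_turns` is a hypothetical parameter, and the sketch ignores both the tokens a running summary would add and any loss in answer quality from dropping older context.

```python
def total_input_tokens(system_tokens, user_tokens, assistant_tokens, turns,
                       max_history_turns=None):
    """Sum billed input tokens over a whole conversation.

    max_history_turns is a hypothetical cap: only that many of the most
    recent completed turns are resent. None means the full history is resent.
    """
    total = 0
    for turn in range(1, turns + 1):
        prior_turns = turn - 1
        if max_history_turns is not None:
            prior_turns = min(prior_turns, max_history_turns)
        history = prior_turns * (user_tokens + assistant_tokens)
        total += system_tokens + history + user_tokens
    return total


if __name__ == "__main__":
    for turns in (10, 20):
        full = total_input_tokens(200, 100, 300, turns)
        capped = total_input_tokens(200, 100, 300, turns, max_history_turns=2)
        print(f"{turns} turns: full history {full:,} tokens, last 2 turns only {capped:,} tokens")
```

With the default sizes, doubling from 10 to 20 turns takes the uncapped input from 21,000 to 82,000 tokens, while the capped version goes from 9,800 to 20,800, which is close to linear.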
How do I get from one conversation to a monthly bill?
Multiply the per-conversation cost by your conversations per day, then by ~30. With the defaults — $0.039 per conversation × 1,000 conversations/day — that is about $39.00/day, or roughly $1,170/month. Change the conversations-per-day input to size your own traffic.
Does prompt caching help with conversations?
Yes, especially for the fixed system prompt and any long, unchanging context (instructions, knowledge base). Caching re-prices that repeated input at a small fraction of the normal input price, which trims the fixed amount resent on every turn. It does not by itself stop the history from growing, so combine it with a history cap or summary. Model the effect with the cached & batch discount calculator.
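A rough way to see the size of the saving is to bill the system prompt at full price on the first turn and at a discounted rate afterwards. The sketch below is an assumption-laden illustration: `cached_multiplier=0.1` is a placeholder, not any provider's actual rate, and the model ignores cache-write surcharges, cache expiry, and minimum cacheable prompt lengths that a provider may impose.

```python
def input_cost_with_cached_system(system_tokens, user_tokens, assistant_tokens,
                                  turns, input_price_per_m,
                                  cached_multiplier=0.1):
    """Input cost in dollars when the system prompt is cached after turn 1.

    cached_multiplier is an illustrative assumption: the fraction of the
    normal input price charged for tokens read from the cache.
    """
    full_price_tokens = 0
    cached_tokens = 0
    for turn in range(1, turns + 1):
        history = (turn - 1) * (user_tokens + assistant_tokens)
        full_price_tokens += history + user_tokens
        if turn == 1:
            full_price_tokens += system_tokens  # first call populates the cache
        else:
            cached_tokens += system_tokens      # later calls read it back
    effective_tokens = full_price_tokens + cached_tokens * cached_multiplier
    return effective_tokens / 1_000_000 * input_price_per_m


if __name__ == "__main__":
    for system_tokens in (200, 2_000):
        uncached = input_cost_with_cached_system(system_tokens, 100, 300, 5, 3.0,
                                                 cached_multiplier=1.0)
        cached = input_cost_with_cached_system(system_tokens, 100, 300, 5, 3.0)
        print(f"system={system_tokens}: input ${uncached:.4f} uncached, ${cached:.4f} cached")
```

Under these assumptions a 200-token system prompt saves little, while a 2,000-token one roughly halves the input cost of a 5-turn conversation: the larger the fixed context, the more caching matters.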
How current are these prices?
The bundled defaults are publicly listed prices verified on Jun 25, 2026, linked to source below. Every field — token sizes, turns, prices, conversations per day — is editable, so the calculator stays correct even if a default goes stale. Always confirm current pricing with the provider.
Disclaimer. LLMTCO provides cost estimates and planning tools for informational purposes only. AI API and GPU prices change frequently; bundled defaults reflect publicly listed prices as of the verification date shown (Jun 25, 2026) and may be out of date — always confirm current pricing with the provider. These figures are estimates, not financial, tax, or procurement advice, and do not capture every real-world factor (latency, reliability, compliance, data privacy, engineering time).