Question 1

How do I estimate my monthly LLM API bill?

Accepted Answer

Start from traffic, not tokens. Multiply your number of requests per month by the average input and output tokens per request, then price each side separately, because providers quote input and output prices per million tokens. The formula is requests × ((input tokens per request ÷ 1M × input price) + (output tokens per request ÷ 1M × output price)). With the defaults (100,000 requests of 1,000 tokens in and 300 tokens out on a Claude Sonnet-class model from Anthropic), that is $750.00/month, or $9,000.00 per year.
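As a minimal sketch, the formula above can be written as a small Python function. The function name and its parameters are illustrative, not part of the calculator. The prices of $3.00 per 1M input tokens and $15.00 per 1M output tokens are an assumption chosen because they reproduce the $750.00 default figure; confirm them against the provider's current price list.

```python
def monthly_cost(requests, input_tokens, output_tokens, input_price, output_price):
    """Return monthly spend in dollars. Prices are dollars per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price      # dollars per request, input side
    output_cost = output_tokens / 1_000_000 * output_price   # dollars per request, output side
    return requests * (input_cost + output_cost)


# Defaults from the answer above; the two prices are assumed, not quoted from the source.
monthly = monthly_cost(100_000, 1_000, 300, 3.00, 15.00)
annual = monthly * 12
print(f"${monthly:,.2f}/month, ${annual:,.2f}/year")  # prints "$750.00/month, $9,000.00/year"
```

Keeping the two sides separate in the code mirrors the formula and makes it easy to see which side dominates for a given workload.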

Question 2

What counts as one request?

Accepted Answer

One request is a single API call: one prompt in, one completion out. In a chatbot, each user turn is usually one request. Remember, though, that the whole conversation history is resent on every turn, so average input tokens per request climb as conversations get longer. If your product is conversational rather than one-shot, the cost-per-conversation calculator models that growth directly.
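To see how quickly resent history inflates input tokens, here is a rough sketch under simplifying assumptions: a fixed system prompt, every user message the same size, every reply the same size, and no truncation or caching. The function name and the token counts in the example are hypothetical.

```python
def conversation_tokens(turns, system_tokens, user_tokens, reply_tokens):
    """Total (input, output) tokens billed for one conversation
    when the full history is resent on every turn."""
    total_input = 0
    history = system_tokens
    for _ in range(turns):
        history += user_tokens    # the new user message joins the history
        total_input += history    # the whole history is sent as input
        history += reply_tokens   # the reply joins the history for the next turn
    return total_input, turns * reply_tokens


# Hypothetical 5-turn chat: 500-token system prompt, 100-token messages, 300-token replies.
total_in, total_out = conversation_tokens(5, 500, 100, 300)
print(total_in, total_out)   # prints "7000 1500"
print(total_in / 5)          # prints "1400.0": average input per request, vs 600 on the first turn
```

In this example the average input per request is more than double the first-turn input, which is why a per-request average measured on short conversations tends to underestimate chat workloads.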

Question 3

How do I find my average tokens per request?

Accepted Answer

If you are already in production, divide your provider's reported input and output token totals by your request count for a representative day. Before launch, estimate from a typical prompt: count the words in your system message, the retrieved context, and the user message, and multiply by about 1.33 to get tokens (roughly 0.75 English words per token); do the same for a typical answer. The ratio varies by tokenizer and language, so treat the result as a first approximation. The defaults here (1,000 in, 300 out) describe a fairly compact RAG-style call.
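Both approaches can be sketched in a few lines of Python. The helper names and sample strings are hypothetical, and the 1.33 multiplier is only the rule of thumb from the answer above; a real tokenizer will give different counts.

```python
WORDS_TO_TOKENS = 1.33  # rule of thumb from the text; varies by tokenizer and language


def estimate_tokens(*texts):
    """Pre-launch estimate: whitespace-separated word count times 1.33."""
    words = sum(len(text.split()) for text in texts)
    return round(words * WORDS_TO_TOKENS)


def average_tokens_per_request(total_tokens, request_count):
    """Production estimate: reported token total divided by request count."""
    return total_tokens / request_count


# Hypothetical system message and user message (10 words in total).
estimated = estimate_tokens("You are a helpful assistant.", "What is the refund policy?")
print(estimated)  # prints "13"

# Hypothetical day of traffic: 3.4M input tokens across 3,400 requests.
print(average_tokens_per_request(3_400_000, 3_400))  # prints "1000.0"
```

Run the production calculation separately for input and output totals, since the two sides are priced differently.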

Question 4

Why show the annual figure too?

Accepted Answer

Because procurement and budgeting happen yearly. A workload that looks cheap at $750.00 a month is a $9,000.00 line item over a year, large enough to justify negotiating committed-use discounts, evaluating a cheaper model, or weighing self-hosting. The annual view also makes the effect of scaling traffic concrete: at 10× the requests, the same workload costs $90,000.00 a year.

Question 5

How can I bring this number down?

Accepted Answer

Three levers, in order of usual impact: cut output length (output tokens are typically priced several times higher than input tokens), trim the prompt you resend on every call (prompt caching helps enormously for fixed context), and switch to a cheaper model for the requests that do not need a frontier one. The provider price comparison shows the same volume across every model at once.
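The three levers can be compared side by side with a small what-if sketch. Everything numeric here is an assumption for illustration: the $3.00 / $15.00 baseline prices, the 200-token shorter answer, the 80% cached share billed at 0.1× the input price, and the $1.00 / $5.00 cheaper model. The sketch also ignores any surcharge a provider may apply when writing to the cache.

```python
def monthly_cost(requests, input_tokens, output_tokens, input_price, output_price,
                 cached_fraction=0.0, cache_price_multiplier=1.0):
    """Dollars per month; prices are dollars per 1M tokens.

    cached_fraction: share of input tokens served from a prompt cache (assumed).
    cache_price_multiplier: price of a cached token relative to a normal input token (assumed).
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    billable_input = fresh + cached * cache_price_multiplier
    input_cost = billable_input / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return requests * (input_cost + output_cost)


baseline = monthly_cost(100_000, 1_000, 300, 3.00, 15.00)

# Hypothetical scenarios, one per lever.
scenarios = {
    "shorter output (300 -> 200 tokens)": monthly_cost(100_000, 1_000, 200, 3.00, 15.00),
    "80% of prompt cached at 0.1x": monthly_cost(
        100_000, 1_000, 300, 3.00, 15.00,
        cached_fraction=0.8, cache_price_multiplier=0.1),
    "cheaper model ($1 in / $5 out)": monthly_cost(100_000, 1_000, 300, 1.00, 5.00),
}

for name, cost in scenarios.items():
    print(f"{name}: ${cost:,.2f} (saves ${baseline - cost:,.2f})")
```

Which lever wins depends on your own input-to-output ratio and on how much of the prompt is truly fixed, so plug in measured values before acting on the ranking.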

Question 6

How current are these prices?

Accepted Answer

The bundled defaults are publicly listed prices verified on Jun 25, 2026, each linked to its source. Both price fields are editable, so the calculator stays correct even if a default goes stale. Always confirm current pricing with the provider.

Side	Tokens per request	Requests per month	Monthly tokens
Input	1,000	100,000	100M
Output	300	100,000	30M

Monthly LLM API Spend Calculator
