GPU TCO Calculator (Owned Hardware)


🛠️ Free tool · Updated Jun 25, 2026 · By Francesco Zinghinì

Work out the true monthly cost of a GPU you own, not just its sticker price. This calculator amortizes the purchase over your refresh window, adds the electricity it actually burns at your duty cycle, folds in a flat operational overhead, and then converts the total into a cost per million tokens using your measured throughput. Numbers update as you type. Defaults were verified on Jun 25, 2026 (see sources); every field is editable.

Total / month: $933.95

Total / year: $11,207

Cost / 1M tokens: $0.2843

Monthly cost breakdown
Component	Formula	Monthly
Amortized capital	$15,000 ÷ 36	$416.67
Electricity	400W ÷ 1000 × 24 × 30 × 50% × $0.12	$17.28
Ops overhead	flat	$500.00
Total	sum	$933.95

Tokens served this month ≈ 3.3B at 2,500 tok/s and 50% utilization. Cost per token falls as you keep the card busier.

Owning is a fixed cost. The amortized capital is owed whether the GPU runs flat-out or sits idle. If you cannot keep it busy, your cost per token soars — a half-idle owned GPU can be more expensive per token than the API. Pair this with the utilization break-even tool before committing to hardware.

How it works

Total cost of ownership for hardware you buy has two fundamentally different parts. Capital is paid once and then sunk; we spread it evenly across the months you expect to use the card, which is the standard straight-line amortization an accountant would use. Operating cost is paid continuously: electricity, plus a catch-all overhead for everything that keeps the machine alive — rack space, cooling, networking, monitoring, the engineer who reboots it at 3am. Add the amortized capital to the operating cost and you have a single, honest monthly figure you can compare against a rented cloud GPU or a commercial API.

The subtle part is electricity. A data-center GPU is rated for a peak power draw (its TDP), and the calculator assumes it pulls that power only while it is actually computing; idle draw is treated as zero. So the energy term is scaled by utilization: a card at 50% duty cycle burns roughly half the kilowatt-hours of one pinned at 100%. The amortization, by contrast, does not move with utilization at all: you owe the same slice of the purchase price every month regardless of how busy the card is. That asymmetry is the whole story of self-hosting economics, and it is why the per-token cost can swing so wildly.
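The asymmetry can be seen in a few lines of Python. This is a minimal sketch using the page's default inputs; the function names are illustrative and are not taken from the calculator's own code.

```python
def electricity_per_month(watts: float, utilization: float, usd_per_kwh: float) -> float:
    """Monthly energy cost, using the page's 24 x 30 hour month."""
    kilowatts = watts / 1000
    hours_in_month = 24 * 30
    return kilowatts * hours_in_month * utilization * usd_per_kwh


def amortized_capital(purchase_price: float, months: int) -> float:
    """Straight-line amortization. Note that utilization is not an input."""
    return purchase_price / months


if __name__ == "__main__":
    for utilization in (0.25, 0.50, 1.00):
        energy = electricity_per_month(400, utilization, 0.12)
        capital = amortized_capital(15_000, 36)
        print(f"{utilization:>4.0%}  electricity ${energy:6.2f}  capital ${capital:7.2f}")
```

The electricity column doubles each time utilization doubles, while the capital column stays at the same value on every row.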

To get a price per million tokens we need to know how many tokens the GPU actually produces. That depends on throughput (the sustained tokens-per-second your model and serving stack achieve on this card) multiplied by the seconds in the month and your utilization. Divide the monthly dollar cost by the month's token count expressed in millions and you have a directly comparable unit price. Because throughput varies enormously by model size, quantization, batch size, and framework, treat the default as a placeholder and replace it with a number you have actually benchmarked.

Formulas.
Amortized capital = purchase price ÷ amortization months
Electricity/month = watts ÷ 1000 × 24 × 30 × utilization × $/kWh
Total monthly = amortized capital + electricity + ops overhead
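Tokens/month = throughput (tok/s) × 3,600 s/hour × 730 hours × utilization
Cost per 1M = total monthly ÷ (tokens/month ÷ 1,000,000)
Note: the electricity term uses a 720-hour month (24 × 30), while the token count uses a 730-hour month (8,760 ÷ 12). The figures on this page follow that convention.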
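The formulas above translate directly into code. The following is a sketch, not the calculator's actual implementation: the `GpuTcoInputs` class and `monthly_tco` function are hypothetical names, and the field defaults are simply the page's default inputs.

```python
from dataclasses import dataclass


@dataclass
class GpuTcoInputs:
    purchase_price: float = 15_000.0    # USD
    amortization_months: int = 36
    watts: float = 400.0                # rated power draw
    usd_per_kwh: float = 0.12
    utilization: float = 0.50           # duty cycle, between 0 and 1
    ops_overhead: float = 500.0         # flat USD per month
    throughput_tok_s: float = 2_500.0   # measured tokens per second


def monthly_tco(x: GpuTcoInputs) -> dict[str, float]:
    """Apply the page's five formulas and return each intermediate figure."""
    if not 0 < x.utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    capital = x.purchase_price / x.amortization_months
    # Electricity uses a 24 x 30 hour month, as in the formula list.
    electricity = x.watts / 1000 * 24 * 30 * x.utilization * x.usd_per_kwh
    total = capital + electricity + x.ops_overhead
    # Token count uses 730 hours per month, as in the formula list.
    tokens = x.throughput_tok_s * 3_600 * 730 * x.utilization
    return {
        "capital": capital,
        "electricity": electricity,
        "total": total,
        "tokens": tokens,
        "cost_per_million": total / (tokens / 1_000_000),
    }


if __name__ == "__main__":
    for name, value in monthly_tco(GpuTcoInputs()).items():
        print(f"{name:>16}: {value:,.4f}")
```

Keeping the intermediate values in the result makes it easy to check each row of the breakdown table against the function's output.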

A worked example

Take the defaults: a $15,000 GPU amortized over 36 months, drawing 400W at $0.12/kWh, running at 50% utilization with $500/mo of overhead, serving 2,500 tokens/second.

Amortized capital: $15,000 ÷ 36 = $416.67/mo
Electricity: 400W ÷ 1000 × 24 × 30 × 0.5 × $0.12 = $17.28/mo
Ops overhead: $500.00/mo
Total: $416.67 + $17.28 + $500.00 = $933.95/mo ($11,207/yr)
Tokens/month: 2,500 × 3,600 × 730 × 0.5 = 3.285B ≈ 3.3B
Cost per 1M tokens: $933.95 ÷ 3,285 (millions of tokens) ≈ $0.2843

Notice how small the electricity term is next to the amortized hardware and the flat overhead: at these defaults the two fixed costs make up about 98% of the monthly bill. That means the single biggest lever on your cost per token is utilization: the same card at 50% utilization costs nearly twice as much per token as it would at 100% (about $0.284 versus $0.145 per 1M tokens), because the fixed costs are spread over half as many tokens. Once you have a monthly figure, sanity-check it against renting the same card in the cloud GPU cost calculator, and against paying per token in the API vs self-hosting comparator. If your throughput assumptions are shaky, plan capacity properly with the throughput planner and confirm the card can even hold your model with the VRAM fit checker.
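To see how strongly utilization moves the unit price, the defaults can be swept across a range of duty cycles. This is an illustrative sketch that reuses the page's formulas; `cost_per_million` is a hypothetical helper name, and the utilization values in the loop are arbitrary sample points.

```python
def cost_per_million(
    utilization: float,
    purchase_price: float = 15_000.0,
    months: int = 36,
    ops_overhead: float = 500.0,
    watts: float = 400.0,
    usd_per_kwh: float = 0.12,
    throughput_tok_s: float = 2_500.0,
) -> float:
    """USD per 1M tokens at a given utilization, using the page's formulas."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    fixed = purchase_price / months + ops_overhead          # independent of utilization
    electricity = watts / 1000 * 24 * 30 * utilization * usd_per_kwh
    tokens_millions = throughput_tok_s * 3_600 * 730 * utilization / 1_000_000
    return (fixed + electricity) / tokens_millions


if __name__ == "__main__":
    for u in (0.10, 0.25, 0.50, 0.75, 1.00):
        print(f"{u:>4.0%}  ${cost_per_million(u):.4f} per 1M tokens")
```

The price does not fall in exact proportion to utilization, because the small electricity term grows as the card gets busier.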

For reference GPU specifications and street prices, see the GPU pricing dataset; for the models you might run on this hardware, see the open-weight model dataset. Full derivations live in the methodology, and every default is attributed in sources.

Frequently asked questions

What does the GPU TCO calculator actually include?

Three real cost buckets for hardware you own: (1) amortized capital — the purchase price spread evenly over the amortization window; (2) electricity — the card's power draw times your duty cycle times your $/kWh; and (3) a flat monthly overhead you enter for everything else (rack space, cooling, networking, spare parts, on-call time). It deliberately excludes resale value and financing interest, which are easy to bolt on afterward. See the methodology.

How is the amortized monthly cost calculated?

Straight-line: purchase price ÷ amortization months. At the defaults that is $15,000 ÷ 36 = $416.67/month. A 36-month window matches a typical three-year hardware refresh; shorten it if you expect the card to be obsolete sooner, lengthen it if you plan to run it into the ground.
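As a quick illustration of how the window length changes the monthly figure, here is a small sketch. The function name is hypothetical, and the alternative windows (24, 48, and 60 months) are sample values chosen for comparison, not recommendations from the page.

```python
def amortized_monthly(purchase_price: float, months: int) -> float:
    """Straight-line amortization: an equal share of the price every month."""
    if months <= 0:
        raise ValueError("amortization window must be at least one month")
    return purchase_price / months


if __name__ == "__main__":
    for months in (24, 36, 48, 60):
        print(f"{months} months: ${amortized_monthly(15_000, months):,.2f}/month")
```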

Why does utilization change the electricity bill but not the amortization?

Capital is sunk the day you buy the card — you owe the amortization whether it runs 24/7 or sits idle, so it is fixed. Power is only drawn while the GPU is working, so the energy term scales with utilization. That asymmetry is exactly why an idle owned GPU is so expensive per token: the big fixed cost gets spread over very few tokens.

How do you turn a monthly cost into a price per million tokens?

Divide the monthly cost by the tokens the GPU actually produces that month, counted in millions. Tokens/month = throughput (tok/s) × 3,600 seconds/hour × 730 hours × utilization. At 2,500 tok/s and 50% utilization that is about 3.3B tokens (3,285 million), so $933.95 ÷ 3,285 ≈ $0.2843 per 1M tokens. Throughput should be a number measured for your model and GPU: benchmark it, don't guess.

Owned hardware or rented cloud GPU — which is cheaper?

Owning wins at high, sustained utilization over a long horizon, because you stop paying the cloud's margin. Renting wins for bursty or short-lived workloads because you pay only for hours used. Compare your owned monthly figure here against the cloud GPU cost calculator, and check both against the API with the API vs self-hosting comparator.
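A rough way to locate the crossover is to solve for the utilization at which the owned monthly cost equals the rental bill. The sketch below makes several assumptions that are not from this page: the rented card delivers the same throughput, it is billed only for the hours actually used, and the $2.50/hour rate in the usage example is a made-up placeholder rather than a quoted price. The function name is hypothetical.

```python
from typing import Optional


def break_even_utilization(
    cloud_usd_per_hour: float,
    purchase_price: float = 15_000.0,
    months: int = 36,
    ops_overhead: float = 500.0,
    watts: float = 400.0,
    usd_per_kwh: float = 0.12,
) -> Optional[float]:
    """Utilization u where owned cost equals rented cost.

    Owned:  fixed + electricity_at_full_load * u
    Rented: cloud_usd_per_hour * 730 * u   (assumes billing for used hours only)
    Returns None when renting is cheaper than the owned card's electricity alone.
    A result above 1.0 means owning never wins within a single month.
    """
    fixed = purchase_price / months + ops_overhead
    electricity_full = watts / 1000 * 24 * 30 * usd_per_kwh
    cloud_full = cloud_usd_per_hour * 730
    if cloud_full <= electricity_full:
        return None
    return fixed / (cloud_full - electricity_full)


if __name__ == "__main__":
    rate = 2.50  # hypothetical rental price in USD per hour
    u = break_even_utilization(rate)
    print(f"At ${rate:.2f}/hour, owning breaks even near {u:.1%} utilization")
```

Above the returned utilization the owned card is cheaper under these assumptions; below it, renting is.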

Are these default prices current?

The bundled GPU purchase prices and power figures were verified on Jun 25, 2026 and each links to its source. They are convenience defaults only — every field is editable, so the calculator stays correct even when a default goes stale. Always confirm current hardware and electricity prices for your region.

Sources & pricing references

Disclaimer. LLMTCO provides cost estimates and planning tools for informational purposes only. AI API and GPU prices change frequently; bundled defaults reflect publicly listed prices as of the verification date shown (Jun 25, 2026) and may be out of date — always confirm current pricing with the provider. These figures are estimates, not financial, tax, or procurement advice, and do not capture every real-world factor (latency, reliability, compliance, data privacy, engineering time).