Sources
Every price in the dataset links to the provider's official page and carries a verification date. Every formula is documented on the methodology page. Prices are the providers' public list prices and serve as convenience defaults that can be overridden in every tool.
API pricing
- Claude Sonnet-class (Anthropic) — $3/$15 per 1M (in/out), verified Jun 25, 2026. Official pricing →
- GPT-4o-class (OpenAI) — $2.5/$10 per 1M (in/out), verified Jun 25, 2026. Official pricing →
- Gemini Flash-class (Google) — $0.3/$2.5 per 1M (in/out), verified Jun 25, 2026. Official pricing →
- Llama-70B via managed API (e.g. Groq/Together) — $0.6/$0.8 per 1M (in/out), verified Jun 25, 2026. Official pricing →
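As a rough illustration of how the per-1M-token prices above turn into a per-request cost, here is a minimal sketch. The function name, the dictionary keys, and the token counts are hypothetical. The prices are the listed defaults, and the arithmetic is the usual tokens / 1,000,000 × price, which may differ in detail from what the methodology page documents.

```python
def request_cost_usd(input_tokens, output_tokens, in_price_per_1m, out_price_per_1m):
    """Cost in USD of one request, given prices quoted per 1M tokens."""
    input_cost = (input_tokens / 1_000_000) * in_price_per_1m
    output_cost = (output_tokens / 1_000_000) * out_price_per_1m
    return input_cost + output_cost


# Listed defaults from this page: (input $/1M, output $/1M).
# The keys are illustrative labels, not official model identifiers.
PRICES = {
    "claude-sonnet-class": (3.0, 15.0),
    "gpt-4o-class": (2.5, 10.0),
    "gemini-flash-class": (0.3, 2.5),
    "llama-70b-managed": (0.6, 0.8),
}

# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
cost = request_cost_usd(10_000, 2_000, *PRICES["claude-sonnet-class"])
print(f"${cost:.4f}")  # about $0.06 for this example
```

Because output tokens are priced several times higher than input tokens for most entries, the output share of a request often dominates its cost even when the prompt is much longer than the reply.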
GPU pricing
- NVIDIA A100 80GB — 80 GB, from $1.59/hr (cloud), verified Jun 25, 2026. Source →
- NVIDIA H100 80GB — 80 GB, from $2.99/hr (cloud), verified Jun 25, 2026. Source →
- NVIDIA L40S 48GB — 48 GB, from $0.99/hr (cloud), verified Jun 25, 2026. Source →
- NVIDIA RTX 4090 24GB — 24 GB, from $0.69/hr (cloud), verified Jun 25, 2026. Source →
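To show how an hourly GPU price relates to a cost per token, the sketch below divides the hourly rate by the tokens generated in an hour. It is a deliberately simplified assumption: it ignores utilization, batching, and the amortization terms the methodology page covers. The throughput figure is hypothetical, not a benchmark.

```python
def gpu_cost_per_1m_tokens(hourly_usd, tokens_per_second):
    """USD per 1M generated tokens for a GPU rented at hourly_usd.

    Assumes the GPU is busy for the whole hour at a steady throughput.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_usd / tokens_per_hour * 1_000_000


# H100 at the listed "from" price; 1,000 tokens/s is an assumed throughput.
h100_cost = gpu_cost_per_1m_tokens(2.99, 1_000)
print(f"${h100_cost:.2f} per 1M tokens")  # about $0.83 under these assumptions
```

Real throughput varies widely with model size, quantization, and batch size, so the result is only as good as the throughput number supplied.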
Open-weight models
- Llama-3 70B — 70B parameters, Llama Community. Model card →
- Mixtral 8x7B — 47B parameters, Apache-2.0. Model card →
- Llama-3 8B — 8B parameters, Llama Community. Model card →
- Qwen2 7B — 7B parameters, Apache-2.0. Model card →
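For a quick sense of whether a model from this list fits on one of the GPUs above, a back-of-the-envelope estimate multiplies the parameter count by bytes per parameter and adds headroom. The sketch below assumes 2, 1, and 0.5 bytes per parameter for fp16, int8, and int4, plus a flat 20% headroom for KV cache and runtime overhead. The headroom figure and all names are illustrative assumptions, not the values used on the methodology page.

```python
# Bytes per parameter by quantization (standard storage widths).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_vram_gb(params_billions, quant, headroom=0.2):
    """Rough VRAM need in GB: weights plus a flat headroom fraction.

    1B parameters at 1 byte each is 1 GB (decimal), so the weight size
    in GB is simply params_billions * bytes_per_param.
    """
    weights_gb = params_billions * BYTES_PER_PARAM[quant]
    return weights_gb * (1 + headroom)


llama70_fp16 = estimate_vram_gb(70, "fp16")  # roughly 168 GB
llama70_int4 = estimate_vram_gb(70, "int4")  # roughly 42 GB
llama8_fp16 = estimate_vram_gb(8, "fp16")    # roughly 19 GB

# Under these assumptions a 4-bit 70B model fits in 48 GB, fp16 does not fit in 80 GB.
print(llama70_int4 <= 48, llama70_fp16 <= 80)
```

A flat headroom is the crudest part of this estimate, since KV-cache size actually grows with context length and batch size.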
Formulas & references
- Token pricing and $/1M-token math — see methodology §1.
- GPU amortization and cost-per-token — methodology §2–§6.
- VRAM estimation (bytes-per-parameter by quantization, KV-cache headroom) — methodology §7.
- FX rates (EUR/GBP): Frankfurter, cached daily with a bundled fallback.
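The "cached daily with a bundled fallback" behaviour for FX rates could look something like the sketch below. It is an assumption about the general pattern, not the site's implementation. The fetcher is passed in as a function so no particular Frankfurter endpoint is assumed, and the fallback rates are placeholder numbers, not real market data.

```python
import datetime

# Placeholder USD-based rates for illustration only; not real market data.
FALLBACK_RATES = {"EUR": 0.9, "GBP": 0.8}


class DailyRateCache:
    """Fetches rates at most once per day, with a bundled fallback table."""

    def __init__(self, fetch_rates, fallback=FALLBACK_RATES):
        self._fetch = fetch_rates        # callable returning {"EUR": ..., "GBP": ...}
        self._fallback = dict(fallback)
        self._rates = None
        self._fetched_on = None

    def rates(self, today=None):
        today = today or datetime.date.today()
        if self._rates is not None and self._fetched_on == today:
            return self._rates           # already fetched today
        try:
            self._rates = dict(self._fetch())
            self._fetched_on = today
            return self._rates
        except Exception:
            # Prefer the last successful fetch; otherwise use the bundled table.
            return self._rates if self._rates is not None else self._fallback


def convert_usd(amount_usd, currency, cache, today=None):
    """Convert a USD amount using whatever rates the cache can provide."""
    return amount_usd * cache.rates(today)[currency]
```

Injecting the fetcher keeps the cache testable offline and lets the same logic wrap any rate source.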
Spotted a stale price or an error? Let me know — corrections are credited.