# LLM Pricing & Cost Calculation

> Reference guide on default LLM pricing list rates and cost calculations on the Guild platform.

The Guild platform estimates LLM spend in real time from per-million-token list prices. Cost is computed for each individual LLM call as a linear function of that call's token counts: each token category is multiplied by its per-million rate, and the results are summed.

## Default list prices

The table below lists the default list prices configured on the platform, in USD per million tokens.

| Model / Tier                        | Input Price (per M) | Output Price (per M) | Cache Read (per M) | Cache Write (per M) |
| :---------------------------------- | :------------------ | :------------------- | :----------------- | :------------------ |
| **Anthropic Claude**                |                     |                      |                    |                     |
| `claude-sonnet-4-6`                 | \$3.00              | \$15.00              | \$0.30             | \$3.75              |
| `claude-sonnet-4-5-20250929`        | \$3.00              | \$15.00              | \$0.30             | \$3.75              |
| `claude-haiku-4-5-20251001`         | \$1.00              | \$5.00               | \$0.10             | \$1.25              |
| `claude-opus-4` / `claude-opus-4-7` | \$15.00             | \$75.00              | \$1.50             | \$18.75             |
| **OpenAI GPT**                      |                     |                      |                    |                     |
| `gpt-4o-2024-08-06`                 | \$2.50              | \$10.00              | \$1.25             | \$2.50              |
| `gpt-4.1`                           | \$2.00              | \$8.00               | \$0.50             | \$2.00              |
| **Google Gemini**                   |                     |                      |                    |                     |
| `gemini-3.1-pro-preview`            | \$1.25              | \$10.00              | \$0.3125           | \$0.00              |
| `gemini-3-pro-preview`              | \$1.25              | \$10.00              | \$0.3125           | \$0.00              |
| `gemini-3.5-flash`                  | \$1.50              | \$9.00               | \$0.15             | \$0.00              |

### Default fallback rate

If a model name is not recognized or does not match any of the custom tiers above, the system logs a warning and falls back to the default list rate (matching the Claude Sonnet 4.6 tier). The Models breakdown table displays a warning indicator next to the model name to signal that the spend is an estimate.

* **Input**: \$3.00 per Million
* **Output**: \$15.00 per Million
* **Cache Read**: \$0.30 per Million
* **Cache Write**: \$3.75 per Million
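
The fallback behavior can be sketched as a simple lookup. This is a minimal illustration, not the platform's implementation: the `Rates` class, the `resolve_rates` function, the logger name, and the use of exact-name matching are all assumptions made for the example, and only a subset of the table is included.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("pricing")  # hypothetical logger name


@dataclass(frozen=True)
class Rates:
    """USD per million tokens for each token category."""
    input: float
    output: float
    cache_read: float
    cache_write: float


# Subset of the default list prices from the table above.
RATES = {
    "claude-sonnet-4-6": Rates(3.00, 15.00, 0.30, 3.75),
    "claude-haiku-4-5-20251001": Rates(1.00, 5.00, 0.10, 1.25),
    "gpt-4.1": Rates(2.00, 8.00, 0.50, 2.00),
    "gemini-3.5-flash": Rates(1.50, 9.00, 0.15, 0.00),
}

# Default fallback rate (same values as the Claude Sonnet 4.6 tier).
FALLBACK = Rates(3.00, 15.00, 0.30, 3.75)


def resolve_rates(model: str) -> tuple[Rates, bool]:
    """Return (rates, is_estimate); is_estimate is True when the fallback was used.

    Assumption: models are matched by exact name. The real matching
    rules may differ.
    """
    rates = RATES.get(model)
    if rates is None:
        logger.warning("Unrecognized model %r; using default fallback rate", model)
        return FALLBACK, True
    return rates, False
```

In this sketch, the returned `is_estimate` flag plays the role of the warning indicator shown in the Models breakdown table.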

## Prompt caching dynamics

Prompt caching is billed differently by each provider, so the platform prices cached tokens per provider:

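* **Anthropic / OpenAI**: Tokens written to the prompt cache are billed at the `cache_write` (`cache_create`) rate, and subsequent cache hits are billed at the discounted `cache_read` rate. In the table above, the Anthropic cache write rate is a premium over the input price (1.25×), while the listed OpenAI cache write rate equals the input price.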
* **Google Gemini**: Google structures prompt caching differently, billing cache storage per hour rather than at a per-token write rate. As a result, `cache_write_tokens` are priced at **\$0.00** (`cache_create: 0.0`), while `cache_read_tokens` are billed at the discounted cache read rate.
* **Double-charging prevention**: The platform's query layers deduct `cache_read_tokens` from the `input_tokens` count on each LLM call before applying the pricing rates, so cached tokens are not billed at both the input rate and the cache read rate. The billable input is always `billable_input = max(input_tokens - cache_read_tokens, 0)`.
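
The per-call calculation described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the platform's actual code: the `Rates` class and `call_cost` function are hypothetical names, and the sketch assumes `cache_write_tokens` are reported separately and are not deducted from `input_tokens` (the source only specifies the deduction for `cache_read_tokens`).

```python
from dataclasses import dataclass

PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """USD per million tokens for each token category."""
    input: float
    output: float
    cache_read: float
    cache_write: float


def call_cost(
    rates: Rates,
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int = 0,
    cache_write_tokens: int = 0,
) -> float:
    """Estimated USD cost of one LLM call, linear in each token count."""
    # Deduct cache reads from input so they are not charged twice.
    billable_input = max(input_tokens - cache_read_tokens, 0)
    total = (
        billable_input * rates.input
        + output_tokens * rates.output
        + cache_read_tokens * rates.cache_read
        # Assumption: cache writes are counted separately from input_tokens.
        + cache_write_tokens * rates.cache_write
    )
    return total / PER_MILLION


# claude-sonnet-4-6 rates from the table above.
sonnet = Rates(input=3.00, output=15.00, cache_read=0.30, cache_write=3.75)

# 10,000 input tokens, of which 8,000 were cache reads, plus 500 output tokens:
# billable input is 2,000 tokens, so the cost is about $0.0159.
cost = call_cost(sonnet, input_tokens=10_000, output_tokens=500, cache_read_tokens=8_000)
```

With Gemini rates, where the cache write rate is `0.0`, the `cache_write_tokens` term contributes nothing to the total, which matches the per-hour storage billing noted above.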
