Estimate Input Token Costs for LLM Gateway

AssemblyAI’s LLM Gateway is a unified API that provides access to 25+ models from Claude, GPT, Gemini, and more through a single interface. It’s a powerful way to extract insights from transcripts generated from audio and video files. Because inputs and outputs vary widely across these use cases, LLM Gateway pricing is based on both input and output tokens.

Output tokens will vary depending on the model and the complexity of your request, but how do you determine the number of input tokens you’ll be sending to LLM Gateway? How many tokens do an audio file’s transcript and your prompt contain? This guide shows how to roughly calculate that figure so you can predict LLM Gateway’s input token cost ahead of time.

This guide calculates input token costs only. Output token costs will vary based on the model used and the length of the generated response.

To see the specific cost of each model (per 1M input and output tokens) applicable to your AssemblyAI account, refer to the Rates table on the Billing page of the dashboard.

Quickstart

Python
import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "YOUR_API_KEY"}

# Transcribe audio file
audio_url = "https://assembly.ai/wildfires.mp3"
data = {"audio_url": audio_url, "speech_models": ["universal-3-pro"]}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

# Poll for completion
print("Waiting for transcription to complete...")

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    time.sleep(3)

# Define your prompt
prompt = "Provide a brief summary of the transcript."

# Calculate character count (transcript + prompt)
transcript_chars = len(transcript["text"])
prompt_chars = len(prompt)
total_chars = transcript_chars + prompt_chars
print(f"\nTotal characters: {total_chars}")

# Estimate tokens (roughly 4 characters = 1 token)
estimated_tokens = total_chars / 4
tokens_in_millions = estimated_tokens / 1_000_000

# Calculate input costs for different models
gpt5_cost = 1.25 * tokens_in_millions
claude_sonnet_cost = 3.00 * tokens_in_millions
gemini_pro_cost = 1.25 * tokens_in_millions

print(f"Estimated input tokens: {estimated_tokens:,.0f}")
print(f"\nEstimated input costs:")
print(f"GPT-5: ${gpt5_cost:.4f}")
print(f"Claude 4.5 Sonnet: ${claude_sonnet_cost:.4f}")
print(f"Gemini 2.5 Pro: ${gemini_pro_cost:.4f}")

Step-by-Step Guide

Install dependencies

Install the required library:

Python
$ pip install requests

Set up your API key

Import the necessary libraries and set your AssemblyAI API key, which can be found on your account dashboard:

Python
import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "YOUR_API_KEY"}

Transcribe your audio file

Transcribe your audio file using AssemblyAI:

Python
audio_url = "https://assembly.ai/wildfires.mp3"
data = {"audio_url": audio_url, "speech_models": ["universal-3-pro"]}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

# Poll for completion
print("Waiting for transcription to complete...")

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    time.sleep(3)
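
The polling loop above runs until the transcript completes or fails, with no upper bound on the wait. As a hedged sketch rather than part of the official guide, the same logic can be wrapped in a helper that gives up after a deadline. The names `wait_for_transcript`, `fetch_status`, `timeout_s`, and `interval_s` are hypothetical, and the 600-second default is an arbitrary assumption; only the `completed` and `error` status values and the `error` field come from the example above.

```python
import time


def wait_for_transcript(fetch_status, timeout_s=600.0, interval_s=3.0, sleep=time.sleep):
    """Poll until the transcript completes, fails, or the deadline passes.

    fetch_status is a hypothetical zero-argument callable that returns the
    transcript JSON as a dict (for example, a wrapper around requests.get).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        transcript = fetch_status()
        status = transcript.get("status")
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript.get('error')}")
        # Any other status is treated as "still in progress" (an assumption).
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout_s} seconds")
        sleep(interval_s)
```

With the variables defined in this guide, a call might look like `transcript = wait_for_transcript(lambda: requests.get(polling_endpoint, headers=headers).json())`. Passing the fetch step in as a callable keeps the helper independent of any HTTP library and easy to exercise with canned responses.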

Calculate character count

We’ll count the characters in both the transcript and your prompt:

Python
# Define your prompt
prompt = "Provide a brief summary of the transcript."

# Calculate character count (transcript + prompt)
transcript_chars = len(transcript["text"])
prompt_chars = len(prompt)
total_chars = transcript_chars + prompt_chars
print(f"\nTotal characters: {total_chars}")

For this specific file with the example prompt, the transcript contains approximately 4,880 characters and the prompt contains 42 characters, for a total of 4,922 characters.

Estimate tokens

Different LLM providers use different tokenization methods, but a rough estimate is that 4 characters equals approximately 1 token. This is based on guidance from:

  • Claude tokenization documentation
  • OpenAI token counting guide
  • Gemini tokens documentation

Python
# Estimate tokens (roughly 4 characters = 1 token)
estimated_tokens = total_chars / 4
tokens_in_millions = estimated_tokens / 1_000_000

print(f"Estimated input tokens: {estimated_tokens:,.0f}")

Language considerations

Token counts can differ significantly across languages. Non-English languages typically require more tokens per character than English. For instance, text in languages like Spanish, Chinese, or Arabic may use 2-3 characters per token instead of 4, resulting in higher token costs for the same amount of content.
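
To see how much the characters-per-token ratio matters, the estimate can be parameterized. The following is a minimal sketch, assuming the ratios mentioned above (4 for English, 2 to 3 for some other languages) are rough heuristics rather than exact tokenizer behavior. The function name `estimate_tokens` and its `chars_per_token` parameter are hypothetical, and 4,922 is the example character total from this guide.

```python
def estimate_tokens(total_chars, chars_per_token=4.0):
    """Rough token estimate from a character count and an assumed ratio."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return total_chars / chars_per_token


example_chars = 4922  # transcript + prompt characters from this guide's example

# Compare the English heuristic with the lower ratios mentioned for other languages.
for ratio in (4, 3, 2):
    tokens = estimate_tokens(example_chars, ratio)
    print(f"{ratio} chars/token: about {tokens:,.0f} tokens")
```

At 2 characters per token the same text yields twice as many estimated tokens as at 4, so the estimated input cost doubles as well.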

Calculate input token costs

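LLM Gateway’s pricing is calculated per 1M input tokens. The snippet below uses the input rates (in USD per 1M tokens) for three popular models at the time of writing; check the Rates table on the Billing page of your dashboard for the rates that apply to your account: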

Python
# Calculate input costs for different models (rates per 1M tokens)
gpt5_cost = 1.25 * tokens_in_millions
claude_sonnet_cost = 3.00 * tokens_in_millions
gemini_pro_cost = 1.25 * tokens_in_millions

print(f"\nEstimated input costs:")
print(f"GPT-5: ${gpt5_cost:.4f}")
print(f"Claude 4.5 Sonnet: ${claude_sonnet_cost:.4f}")
print(f"Gemini 2.5 Pro: ${gemini_pro_cost:.4f}")

For our example file, 4,922 characters ÷ 4 gives approximately 1,230 input tokens, so the estimated input costs are:

  • GPT-5 (gpt-5): ~$0.0015
  • Claude 4.5 Sonnet (claude-sonnet-4-5-20250929): ~$0.0037
  • Gemini 2.5 Pro (gemini-2.5-pro): ~$0.0015
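
The figures above can be reproduced with a small reusable helper. This is a sketch under stated assumptions, not an official utility: `estimate_input_costs` and `RATES_PER_MILLION` are hypothetical names, the rates are copied from this guide's example and may not match your account, and the 4-characters-per-token ratio is the same rough heuristic used throughout.

```python
# Example input rates in USD per 1M tokens, copied from this guide.
# Assumption: replace these with the values from your own Rates table.
RATES_PER_MILLION = {
    "GPT-5": 1.25,
    "Claude 4.5 Sonnet": 3.00,
    "Gemini 2.5 Pro": 1.25,
}


def estimate_input_costs(total_chars, rates_per_million, chars_per_token=4.0):
    """Return a dict mapping each model name to its estimated input cost in USD."""
    estimated_tokens = total_chars / chars_per_token
    tokens_in_millions = estimated_tokens / 1_000_000
    return {model: rate * tokens_in_millions for model, rate in rates_per_million.items()}


# 4,922 characters is the transcript + prompt total from this guide's example.
costs = estimate_input_costs(4922, RATES_PER_MILLION)
for model, cost in costs.items():
    print(f"{model}: ${cost:.4f}")
```

Keeping the rates in a dictionary means a pricing change or an additional model only requires editing one entry.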

These calculations estimate input token costs only. Output tokens are not included and will vary based on:

  • The model you choose
  • The complexity of your request
  • The length of the generated response

To see the complete pricing for both input and output tokens for all available models, visit the Rates table on the Billing page of your dashboard.

Next steps

  • LLM Gateway Overview - Learn about all available models and capabilities
  • Apply LLM Gateway to Audio Transcripts - Complete guide to using LLM Gateway with transcripts
  • Billing page - View LLM pricing for your account