What to log for support

Every LLM Gateway response includes a request_id — a unique identifier for that specific request. Log this ID for every call, not just when something goes wrong. When you reach out to support@assemblyai.com, including the request_id lets us find the exact request in our logs in seconds. At minimum, capture the following for every request:
  • request_id from the response body
  • The model parameter used
  • The API region (US: llm-gateway.assemblyai.com, EU: llm-gateway.eu.assemblyai.com)
  • A timestamp for when the request was sent
  • The HTTP status code and the full response body when a non-2xx response is returned
A minimal logging example:
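import time

import requests

MODEL = "claude-sonnet-4-6"

# Record the send time before the call so the timestamp reflects when the request was sent.
sent_at = time.time()

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers={"authorization": "<YOUR_API_KEY>"},
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "max_tokens": 1000,
    },
)

# An error body may not be valid JSON, so fall back to an empty dict.
try:
    result = response.json()
except ValueError:
    result = {}

# Log these fields for every call, not only for failures.
log_entry = {
    "timestamp": sent_at,
    "region": "us",
    "model": MODEL,
    "status_code": response.status_code,
    "request_id": result.get("request_id"),
    "error": result.get("error"),
    # Keep the full body for non-2xx responses.
    "response_body": None if response.ok else response.text,
}
print(log_entry)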

Authentication errors (401 / 403)

Symptom: The API responds with 401 Unauthorized or 403 Forbidden.
{
  "error": "Authentication error, API token missing/invalid",
  "status": "error",
  "request_id": "6e6f340d-6580-4cd4-a3f9-d6cb7323a9bd"
}
Causes:
  • API key is missing, malformed, or expired.
  • API key is from a different account or region.
  • The Authorization header is misspelled (e.g. Authorisation) or missing entirely.
Fixes:
  • Confirm your API key on the API Keys page.
  • Pass the key in the Authorization header — not as a query parameter and not prefixed with Bearer.
  • If you’re using EU data residency, make sure the key was generated for the EU region. See Cloud endpoints and data residency.
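The header rules above can be checked before a request is sent. This is a minimal sketch; build_headers is a hypothetical helper, not part of any AssemblyAI SDK, and it only catches the mistakes named in this section.

```python
def build_headers(api_key: str) -> dict:
    """Return request headers with the raw API key in Authorization.

    Hypothetical helper for illustration only.
    """
    key = api_key.strip()  # stray whitespace or newlines make a key malformed
    if not key:
        raise ValueError("API key is empty")
    if key.lower().startswith("bearer "):
        raise ValueError("pass the raw key, without a Bearer prefix")
    return {"Authorization": key}
```

Failing early with a clear message is usually easier to debug than a 401 from the API.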

Bad request (400)

Symptom: The API responds with 400 Bad Request.
{
  "code": 400,
  "message": "invalid request body",
  "request_id": "2a9adf03-c73e-4333-a42d-54b515e6afbd",
  "metadata": {
    "errors": [
      "one of messages or prompt required"
    ]
  }
}
Causes:
  • A required field is missing (model, plus either messages or prompt).
  • The model value is not a recognized model ID — see Available models.
  • max_tokens is outside the valid range or exceeds the model’s context window.
  • A field is the wrong type (e.g. messages sent as a string instead of an array).
Fixes:
  • Validate your request payload against the Basic chat completions reference.
  • Check the metadata.errors array in the response — it lists every field that failed validation.
Validation errors in detail

The metadata.errors array lists every field that failed validation:

| String | Meaning |
| --- | --- |
| "one of messages or prompt required" | Neither messages nor prompt was provided |
| "model {model} is not supported" | The model value is not a recognized model ID |
| "model context limit exceeded" | The input exceeds the model’s context window |
| "model_region can only be set to global" | model_region was set to a value other than "global" |
| "fallback_config depth cannot be greater than 2" | fallback_config.depth exceeds the maximum of 2 |
| "response_format is invalid: {detail}" | The response_format object failed schema validation |

Rate limit exceeded (429)

Symptom: The API responds with 429 Too Many Requests.
Cause: You exceeded the per-model rate limit within a 60-second window. Each model has its own limit.
Fixes:
  • Read the rate limit headers on every response (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) to back off gracefully. See Rate limits for the full header reference.
  • Implement exponential backoff with jitter when you receive a 429.
  • Consider specifying fallback models so traffic spills over to a different model when the primary is rate-limited.
  • If you need a higher rate limit, contact support.
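Exponential backoff with jitter can be sketched as follows. This is an illustration, not an official client: call_with_backoff and full_jitter_delay are hypothetical names, the base delay, the 60-second cap, and the attempt count are assumptions, and the sketch does not read the rate limit headers because their exact semantics are documented on the Rate limits page rather than here.

```python
import random
import time
from typing import Callable


def full_jitter_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Pick a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send: Callable[[], object], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send is any zero-argument callable that returns an object with a
    status_code attribute, for example lambda: requests.post(...).
    """
    response = send()
    for attempt in range(max_attempts - 1):
        if response.status_code != 429:
            break
        sleep(full_jitter_delay(attempt))  # wait longer, on average, after each 429
        response = send()
    return response
```

The random component spreads retries from concurrent clients so they don't all hit the limit again at the same moment. The last response is returned even if it is still a 429, so the caller decides what to do next.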

Transcript not found (404)

Symptom: The API responds with 404 Not Found when you pass a transcript_id that can’t be found.
{
  "code": 404,
  "message": "transcript not found",
  "request_id": "bf08febb-ee48-4ce1-b473-7c7b15561033"
}
Causes:
  • The transcript_id belongs to a different account or was created under a different API key.
  • The transcript was deleted (by you or via a data retention policy). In that case the message will be "transcript deleted".
  • The transcript was created in a different region — US transcripts are not accessible from the EU endpoint and vice versa.
Fixes:
  • Confirm the transcript ID is correct and belongs to the account associated with the API key you’re using.
  • If the transcript was deleted, re-transcribe the audio and use the new transcript ID.
  • Make sure the LLM Gateway region matches the region where the transcript was created.
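The two 404 messages described above call for different fixes, so it can help to branch on them. A minimal sketch, assuming the message field contains exactly the strings quoted in this section; diagnose_transcript_404 is a hypothetical helper.

```python
def diagnose_transcript_404(body: dict) -> str:
    """Suggest a next step based on the message field of a 404 response body."""
    message = body.get("message", "")
    if message == "transcript deleted":
        return "Transcript was deleted: re-transcribe the audio and use the new transcript ID."
    if message == "transcript not found":
        return "Check that the transcript ID, the API key's account, and the region all match."
    return f"Unrecognized 404 message: {message!r}"
```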

Server errors (5xx)

Symptom: The API responds with 500, 502, 503, or 504.
Causes:
  • Transient issues on AssemblyAI’s side or with the upstream model provider.
  • The upstream provider returned a timeout or unavailable response.
Fixes:
  • Retry with exponential backoff and jitter. Most 5xx errors are transient.
  • Check the AssemblyAI Status page for ongoing incidents.
  • If the error persists, contact support with the request_id, the model used, the timestamp, and the full error response body.
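When retrying 5xx errors, it is worth keeping the request_id of every failed attempt so a persistent failure can be reported in full. The sketch below is one way to do that. retry_5xx is a hypothetical helper, the delay values are assumptions, and a 5xx body produced by an intermediate proxy may not be JSON or carry a request_id, in which case None is recorded.

```python
import random
import time

RETRYABLE = {500, 502, 503, 504}


def retry_5xx(send, max_attempts=4, sleep=time.sleep):
    """Retry send() on 5xx responses.

    Returns (last_response, failed_request_ids). send is a zero-argument
    callable returning an object with status_code and json(), such as a
    requests.Response.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    failed_ids = []
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in RETRYABLE:
            return response, failed_ids
        try:
            body = response.json()
        except ValueError:
            body = {}  # non-JSON error page
        failed_ids.append(body.get("request_id") if isinstance(body, dict) else None)
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter, capped at 30 seconds (assumed values).
            sleep(random.uniform(0.0, min(30.0, 2.0 ** attempt)))
    return response, failed_ids
```

If the final response is still a 5xx, failed_ids holds the identifiers to include in a support request.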

Streamed responses don’t appear

Symptom: You set stream: true but receive a single non-streamed response, or no response at all.
Causes:
  • Streaming is currently supported on OpenAI models only. Other providers ignore the stream flag and return a regular response.
  • The HTTP client isn’t reading the response body as a stream of server-sent events (SSE).
Fixes:
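For the second cause, the client has to read the body incrementally instead of waiting for it to complete. The sketch below parses a stream of SSE lines. It assumes OpenAI-style framing, where each event is a line of the form data: <json> and the stream ends with data: [DONE]. This page does not confirm that framing, so treat it as an assumption. iter_sse_json is a hypothetical helper.

```python
import json


def iter_sse_json(lines):
    """Yield the decoded JSON payload of each 'data:' line until '[DONE]'.

    lines is any iterable of str or bytes, one SSE line per item.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            return
        yield json.loads(payload)
```

With the requests library, pass stream=True to requests.post and feed response.iter_lines() to this function. Without stream=True, requests buffers the whole body before returning.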

Unexpected output or quality issues

Symptom: The model returns content you didn’t expect, such as the wrong format, the wrong language, hallucinations, or refusals.
Fixes:
  • Capture the full request payload (model, messages, parameters), the full response, and the request_id. Send all three to support@assemblyai.com — quality issues are difficult to diagnose without the exact prompt.
  • For structured output, use Structured outputs with a JSON schema rather than prompting for JSON in free text.
  • For malformed JSON, enable Post-processing to automatically repair responses.
  • Try a different model — quality varies. See the LMArena scores for a comparison.

Contacting support

If you’ve worked through the steps above and still need help, email support@assemblyai.com with:
  • The request_id from the failing response (or several, for intermittent issues)
  • The model parameter used
  • The API region (US or EU)
  • A timestamp for when the request was sent
  • The HTTP status code and full error response body
  • A minimal reproducible example of the request payload (with your API key redacted)
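The checklist above can be assembled into a single block of text to paste into the email. A minimal sketch; build_support_report and its field names are hypothetical, not a format that support requires.

```python
import json
from datetime import datetime, timezone


def build_support_report(*, request_ids, model, region, sent_at,
                         status_code, response_body, payload) -> str:
    """Return a JSON string with the details listed above.

    sent_at is a Unix timestamp. Request headers are deliberately not
    accepted, so the API key cannot end up in the report.
    """
    report = {
        "request_ids": list(request_ids),
        "model": model,
        "region": region,
        "sent_at": datetime.fromtimestamp(sent_at, tz=timezone.utc).isoformat(),
        "status_code": status_code,
        "response_body": response_body,
        "request_payload": payload,
    }
    return json.dumps(report, indent=2)
```

Review the payload before sending in case the messages themselves contain sensitive data.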