What to log for support
Every LLM Gateway response includes a request_id, a unique identifier for that specific request. Log this ID for every call, not just when something goes wrong. When you reach out to support@assemblyai.com, including the request_id lets us find the exact request in our logs in seconds.
At minimum, capture the following for every request:
- The request_id from the response body
- The model parameter used
- The API region (US: llm-gateway.assemblyai.com, EU: llm-gateway.eu.assemblyai.com)
- A timestamp for when the request was sent
- The full HTTP status code and response body when a non-2xx response is returned
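The fields listed above can be collected into one structured log record per call. The following is a minimal Python sketch, not an official client. It assumes the response body is parsed JSON with request_id as a top-level key (the exact location is an assumption), and build_log_record, log_call, and utc_now are hypothetical helper names.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("llm_gateway")

# Hostnames per region, as listed in this guide
REGION_HOSTS = {
    "us": "llm-gateway.assemblyai.com",
    "eu": "llm-gateway.eu.assemblyai.com",
}


def utc_now():
    """Timestamp to record just before sending a request."""
    return datetime.now(timezone.utc)


def build_log_record(region, model, sent_at, status_code, body):
    """Collect the fields worth logging for one LLM Gateway call."""
    record = {
        # Assumption: request_id is a top-level key of the JSON body
        "request_id": body.get("request_id") if isinstance(body, dict) else None,
        "model": model,
        "region": region,
        "host": REGION_HOSTS[region],
        "sent_at": sent_at.astimezone(timezone.utc).isoformat(),
        "status_code": status_code,
    }
    # Keep the full response body only for non-2xx responses
    if not 200 <= status_code < 300:
        record["response_body"] = body
    return record


def log_call(region, model, sent_at, status_code, body):
    """Log every call: INFO for 2xx, ERROR otherwise."""
    record = build_log_record(region, model, sent_at, status_code, body)
    level = logging.INFO if 200 <= status_code < 300 else logging.ERROR
    logger.log(level, json.dumps(record, default=str))
    return record
```

Call utc_now() immediately before sending the request and pass the result as sent_at, so the logged timestamp reflects when the request left your system rather than when the response arrived.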
Authentication errors (401 / 403)
Symptom: The API responds with 401 Unauthorized or 403 Forbidden.
Causes:
- API key is missing, malformed, or expired.
- API key is from a different account or region.
- The Authorization header is misspelled (e.g. Authorisation) or missing entirely.
Fixes:
- Confirm your API key on the API Keys page.
- Pass the key in the Authorization header, not as a query parameter and not prefixed with Bearer.
- If you’re using EU data residency, make sure the key was generated for the EU region. See Cloud endpoints and data residency.
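To catch the most common header mistakes before a request is sent, a small guard can build the headers in one place. This is a sketch only: build_headers is a hypothetical helper, and the Content-Type value assumes a JSON request body.

```python
def build_headers(api_key):
    """Return request headers with the raw API key in Authorization."""
    key = (api_key or "").strip()
    if not key:
        raise ValueError("API key is missing")
    # The guide says the key must not be prefixed with "Bearer"
    if key.lower().startswith("bearer "):
        raise ValueError("Pass the raw API key without a 'Bearer' prefix")
    return {
        "Authorization": key,  # exact spelling, not "Authorisation"
        "Content-Type": "application/json",  # assumption: JSON body
    }
```

Centralizing header construction means a misspelled header name or a stray prefix fails fast in your own code instead of surfacing as a 401 from the API.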
Bad request (400)
Symptom: The API responds with 400 Bad Request.
Causes:
- A required field is missing (model, plus either messages or prompt).
- The model value is not a recognized model ID. See Available models.
- max_tokens is outside the valid range or exceeds the model’s context window.
- A field is the wrong type (e.g. messages sent as a string instead of an array).
Fixes:
- Validate your request payload against the Basic chat completions reference.
- Check the metadata.errors array in the response. It lists every field that failed validation.
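Reading metadata.errors programmatically makes 400 responses easier to triage. The sketch below assumes the parsed body has the shape {"metadata": {"errors": [...]}} with each entry a plain string (an assumption about the exact layout). The patterns mirror the error strings in the table that follows, and explain_validation_errors is a hypothetical helper name.

```python
import re

# Patterns for the documented error strings, each paired with a short hint
ERROR_HINTS = [
    (r"^one of messages or prompt required$",
     "Provide either messages or prompt."),
    (r"^model .+ is not supported$",
     "Use a model ID from Available models."),
    (r"^model context limit exceeded$",
     "The input exceeds the model's context window."),
    (r"^model_region can only be set to global$",
     'Set model_region to "global".'),
    (r"^fallback_config depth cannot be greater than 2$",
     "Reduce fallback_config.depth to 2 or less."),
    (r"^response_format is invalid: .+$",
     "Fix the response_format object; the detail names the problem."),
]

UNKNOWN_HINT = "Unrecognized error string; include it when contacting support."


def explain_validation_errors(body):
    """Return (message, hint) pairs for each entry in metadata.errors."""
    # Assumption: errors live at body["metadata"]["errors"] as strings
    errors = (body.get("metadata") or {}).get("errors") or []
    explained = []
    for message in errors:
        hint = next(
            (h for pattern, h in ERROR_HINTS if re.match(pattern, message)),
            UNKNOWN_HINT,
        )
        explained.append((message, hint))
    return explained
```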
Common strings in the metadata.errors array:
| String | Meaning |
|---|---|
| "one of messages or prompt required" | Neither messages nor prompt was provided |
| "model {model} is not supported" | The model value is not a recognized model ID |
| "model context limit exceeded" | The input exceeds the model’s context window |
| "model_region can only be set to global" | model_region was set to a value other than "global" |
| "fallback_config depth cannot be greater than 2" | fallback_config.depth exceeds the maximum of 2 |
| "response_format is invalid: {detail}" | The response_format object failed schema validation |
Rate limit exceeded (429)
Symptom: The API responds with 429 Too Many Requests.
Cause: You exceeded the per-model rate limit within a 60-second window. Each model has its own limit.
Fixes:
- Read the rate limit headers on every response (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) to back off gracefully. See Rate limits for the full header reference.
- Implement exponential backoff with jitter when you receive a 429.
- Consider specifying fallback models so traffic spills over to a different model when the primary is rate-limited.
- If you need a higher rate limit, contact support.
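Exponential backoff with jitter can be sketched in a few lines. The version below uses the "full jitter" variant (one common choice, not necessarily what AssemblyAI recommends) and caps the delay at 60 seconds to match the rate limit window described above. backoff_delay and remaining_requests are hypothetical helpers. The format of X-RateLimit-Reset is not described here, so the sketch does not interpret it.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full-jitter backoff: a random delay below min(cap, base * 2**attempt).

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def remaining_requests(headers):
    """Parse X-RateLimit-Remaining; return None if absent or not an integer."""
    value = headers.get("X-RateLimit-Remaining")
    try:
        return int(value)
    except (TypeError, ValueError):
        return None
```

After a 429, sleep for backoff_delay(attempt) seconds before retrying and increment attempt each time. Note that a plain dict lookup is case-sensitive; the headers object returned by the Python requests library is case-insensitive, so the lookup works there regardless of header casing.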
Transcript not found (404)
Symptom: The API responds with 404 Not Found when you pass a transcript_id that can’t be found.
Causes:
- The transcript_id belongs to a different account or was created under a different API key.
- The transcript was deleted (by you or via a data retention policy). In that case the message will be "transcript deleted".
- The transcript was created in a different region. US transcripts are not accessible from the EU endpoint and vice versa.
Fixes:
- Confirm the transcript ID is correct and belongs to the account associated with the API key you’re using.
- If the transcript was deleted, re-transcribe the audio and use the new transcript ID.
- Make sure the LLM Gateway region matches the region where the transcript was created.
Server errors (5xx)
Symptom: The API responds with 500, 502, 503, or 504.
Causes:
- Transient issues on AssemblyAI’s side or with the upstream model provider.
- The upstream provider returned a timeout or unavailable response.
Fixes:
- Retry with exponential backoff and jitter. Most 5xx errors are transient.
- Check the AssemblyAI Status page for ongoing incidents.
- If the error persists, contact support with the request_id, the model used, the timestamp, and the full error response body.
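A retry loop for transient 5xx responses might look like the sketch below. It is deliberately transport-agnostic: send is a hypothetical zero-argument callable that performs one request and returns a (status_code, body) tuple, and the attempt count and delay values are illustrative defaults rather than recommended settings.

```python
import random
import time


def call_with_retries(send, max_attempts=4, base=0.5, cap=8.0, sleep=time.sleep):
    """Call send() until it returns a non-5xx status or attempts run out.

    Returns the last (status_code, body) pair so the caller can still log
    the request_id and error body if every attempt failed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if not 500 <= status < 600:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter between attempts
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    return status, body
```

Only 5xx responses are retried here. Errors such as 400, 401, 403, and 404 are returned immediately, since repeating the same request will not change the outcome.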
Streamed responses don’t appear
Symptom: You set stream: true but receive a single non-streamed response, or no response at all.
Causes:
- Streaming is currently supported on OpenAI models only. Other providers ignore the stream flag and return a regular response.
- The HTTP client isn’t reading the response body as a stream of server-sent events (SSE).
Fixes:
- Confirm the model is from OpenAI. See Available models.
- Use a client that reads SSE chunks (e.g. response.iter_lines() in Python requests, or the streaming fetch body reader in JavaScript). See Basic chat completions: Streamed responses.
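Reading SSE chunks means handling the response line by line rather than waiting for the whole body. The sketch below parses lines such as those produced by response.iter_lines() in Python requests. It assumes each event is a single "data:" line holding JSON and that the stream ends with an OpenAI-style "data: [DONE]" sentinel; both are assumptions about the wire format, and iter_sse_events is a hypothetical helper.

```python
import json


def iter_sse_events(lines):
    """Yield decoded JSON payloads from an iterable of SSE lines."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(payload)
```

With requests, pass stream=True when making the call and iterate with iter_sse_events(response.iter_lines()). Without stream=True, requests buffers the entire body before returning, which looks the same as a non-streamed response.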
Unexpected output or quality issues
Symptom: The model returns content you didn’t expect: wrong format, wrong language, hallucinations, or refusals.
Fixes:
- Capture the full request payload (model, messages, parameters), the full response, and the request_id. Send all three to support@assemblyai.com. Quality issues are difficult to diagnose without the exact prompt.
- For structured output, use Structured outputs with a JSON schema rather than prompting for JSON in free text.
- For malformed JSON, enable Post-processing to automatically repair responses.
- Try a different model — quality varies. See the LMArena scores for a comparison.
Contacting support
If you’ve worked through the steps above and still need help, email support@assemblyai.com with:
- The request_id from the failing response (or several, for intermittent issues)
- The model parameter used
- The API region (US or EU)
- A timestamp for when the request was sent
- The HTTP status code and full error response body
- A minimal reproducible example of the request payload (with your API key redacted)