Troubleshooting
What to log for support
Every LLM Gateway response includes a `request_id`, a unique identifier for that specific request. Log this ID for every call, not just when something goes wrong. When you reach out to support@assemblyai.com, including the `request_id` lets us find the exact request in our logs in seconds.
At minimum, capture the following for every request:
- The `request_id` from the response body
- The `model` parameter used
- The API region (US: `llm-gateway.assemblyai.com`, EU: `llm-gateway.eu.assemblyai.com`)
- A timestamp for when the request was sent
- The full HTTP status code and response body when a non-2xx response is returned
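A minimal logging example, sketched in Python with only the standard library. The helper collects the fields listed above into one log record; the response-field names in the usage comment are assumptions, so check them against the Basic chat completions reference:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm-gateway")

API_BASE = "https://llm-gateway.assemblyai.com"  # EU: https://llm-gateway.eu.assemblyai.com

def log_request_fields(model: str, status_code: int, body: dict, sent_at: float) -> dict:
    """Collect the fields support asks for into a single log record."""
    record = {
        "request_id": body.get("request_id"),  # present on every response
        "model": model,
        "region": "EU" if ".eu." in API_BASE else "US",
        "sent_at": sent_at,
        "status": status_code,
    }
    if status_code >= 300:
        record["response_body"] = body  # keep the full body on non-2xx
    logger.info("llm_gateway_call %s", record)
    return record

# Usage after any call (resp from `requests`, `httpx`, etc.):
# record = log_request_fields(payload["model"], resp.status_code, resp.json(), sent_at)
```

The same record works for success and failure paths, so one log line per call is enough.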
Authentication errors (401 / 403)
Symptom: The API responds with 401 Unauthorized or 403 Forbidden.
Causes:
- API key is missing, malformed, or expired.
- API key is from a different account or region.
- The `Authorization` header is misspelled (e.g. `Authorisation`) or missing entirely.
Fixes:
- Confirm your API key on the API Keys page.
- Pass the key in the `Authorization` header, not as a query parameter, and not prefixed with `Bearer`.
- If you're using EU data residency, make sure the key was generated for the EU region. See Cloud endpoints and data residency.
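A sketch of the correct header placement using only the standard library; the endpoint path and model value below are illustrative placeholders, not confirmed values:

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"

def build_request(url: str, payload: dict) -> urllib.request.Request:
    """The key goes in the Authorization header, raw: no `Bearer ` prefix,
    and never as a query parameter."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": API_KEY,  # not f"Bearer {API_KEY}"
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",  # path is an assumption
    {"model": "your-model", "messages": [{"role": "user", "content": "hi"}]},
)
```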
Bad request (400)
Symptom: The API responds with 400 Bad Request.
Causes:
- A required field is missing (`model`, plus either `messages` or `prompt`).
- The `model` value is not a supported model parameter. See Available models.
- `max_tokens` is outside the valid range or exceeds the model's context window.
- A field is the wrong type (e.g. `messages` sent as a string instead of an array).
Fixes:
- Validate your request payload against the Basic chat completions reference.
- Echo the full error message — it includes the specific field that failed validation.
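A lightweight pre-flight check for the common 400 causes above can catch mistakes before the request leaves your code. This sketch only mirrors the documented rules; the gateway's own validation is authoritative, and `max_tokens` bounds are model-specific, so they are not checked here:

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of validation errors for the common 400 causes."""
    errors = []
    if "model" not in payload:
        errors.append("missing required field: model")
    if "messages" not in payload and "prompt" not in payload:
        errors.append("provide either messages or prompt")
    if "messages" in payload and not isinstance(payload["messages"], list):
        errors.append("messages must be an array, not a string")
    return errors
```

For example, a payload whose `messages` is a bare string fails two checks at once: the missing `model` and the wrong type.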
Rate limit exceeded (429)
Symptom: The API responds with 429 Too Many Requests.
Cause: You exceeded the per-model rate limit within a 60-second window. Each model has its own limit.
Fixes:
- Read the rate limit headers on every response (`X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`) to back off gracefully. See Rate limits for the full header reference.
- Implement exponential backoff with jitter when you receive a 429.
- Consider specifying fallback models so traffic spills over to a different model when the primary is rate-limited.
- If you need a higher rate limit, contact support.
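A sketch of header-driven backoff for 429s. It assumes `X-RateLimit-Reset` carries seconds until the window resets; verify that semantics against the Rate limits page before relying on it:

```python
import random

def retry_delay(headers: dict, attempt: int, cap: float = 60.0) -> float:
    """Prefer the server's reset hint; otherwise fall back to
    exponential backoff with full jitter."""
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        # Small jitter on top of the hint avoids thundering herds.
        return float(reset) + random.uniform(0.0, 1.0)
    return random.uniform(0.0, min(cap, 2.0 ** attempt))
```

Call `retry_delay(resp.headers, attempt)` and sleep for the returned number of seconds before retrying.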
Model not found (404)
Symptom: The API responds with 404 Not Found and an error mentioning the model.
Causes:
- The `model` value is misspelled or has been deprecated.
- The model isn't available in the region you're calling. For example, OpenAI models are only available in the US region. See Cloud endpoints and data residency.
Fixes:
- Double-check the exact model parameter against Available models.
- If you need EU data residency, switch to an EU-supported model (most Anthropic Claude and Google Gemini models).
Server errors (5xx)
Symptom: The API responds with 500, 502, 503, or 504.
Causes:
- Transient issues on AssemblyAI’s side or with the upstream model provider.
- The upstream provider returned a timeout or unavailable response.
Fixes:
- Retry with exponential backoff and jitter. Most 5xx errors are transient.
- Check the AssemblyAI Status page for ongoing incidents.
- If the error persists, contact support with the `request_id`, the model used, the timestamp, and the full error response body.
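The retry advice above can be sketched as a small wrapper that treats only 500, 502, 503, and 504 as transient; 4xx errors are surfaced immediately since retrying them won't help:

```python
import random
import time
import urllib.error
import urllib.request

RETRYABLE = {500, 502, 503, 504}

def should_retry(status: int, attempt: int, max_attempts: int = 4) -> bool:
    """Retry only transient statuses, and only while attempts remain."""
    return status in RETRYABLE and attempt < max_attempts - 1

def post_with_retries(req: urllib.request.Request, max_attempts: int = 4):
    for attempt in range(max_attempts):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as err:
            if not should_retry(err.code, attempt, max_attempts):
                raise  # 4xx, or retries exhausted: surface the error
            time.sleep(random.uniform(0.0, 2.0 ** attempt))  # backoff with jitter
```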
Streamed responses don’t appear
Symptom: You set `stream: true` but receive a single non-streamed response, or no response at all.
Causes:
- Streaming is currently supported on OpenAI models only. Other providers ignore the `stream` flag and return a regular response.
- The HTTP client isn't reading the response body as a stream of server-sent events (SSE).
Fixes:
- Confirm the model is from OpenAI. See Available models.
- Use a client that reads SSE chunks (e.g. `response.iter_lines()` in Python `requests`, or the streaming `fetch` body reader in JavaScript). See the Streamed responses section of Basic chat completions.
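A sketch of consuming the stream with `requests`' `iter_lines()`. The `data: ` framing and `[DONE]` terminator are standard SSE conventions; the JSON shape of each chunk is an assumption, so check it against the Basic chat completions reference:

```python
import json

def parse_sse_line(raw: bytes):
    """Return the decoded JSON payload of one SSE data line, or None
    for blank keep-alive lines, comments, and the [DONE] terminator."""
    line = raw.decode("utf-8").strip()
    if not line.startswith("data: "):
        return None
    data = line[len("data: "):]
    if data == "[DONE]":
        return None
    return json.loads(data)

# Usage with requests (not executed here):
# resp = requests.post(url, json=payload, headers=headers, stream=True)
# for raw in resp.iter_lines():
#     chunk = parse_sse_line(raw)
#     if chunk is not None:
#         ...  # append the chunk's delta to your output
```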
Unexpected output or quality issues
Symptom: The model returns content you didn’t expect — wrong format, wrong language, hallucinations, or refusals.
Fixes:
- Capture the full request payload (model, messages, parameters), the full response, and the `request_id`. Send all three to support@assemblyai.com; quality issues are difficult to diagnose without the exact prompt.
- For structured output, use Structured outputs with a JSON schema rather than prompting for JSON in free text.
- For malformed JSON, enable Post-processing to automatically repair responses.
- Try a different model — quality varies. See the LMArena scores for a comparison.
Contacting support
If you’ve worked through the steps above and still need help, email support@assemblyai.com with:
- The `request_id` from the failing response (or several, for intermittent issues)
- The `model` parameter used
- The API region (US or EU)
- A timestamp for when the request was sent
- The HTTP status code and full error response body
- A minimal reproducible example of the request payload (with your API key redacted)
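When attaching a reproducible example, a small helper keeps the redaction consistent. This is a sketch; it assumes you captured the request as a dict with a `headers` key:

```python
import copy

def redact(captured_request: dict) -> dict:
    """Return a copy of the captured request with the API key masked,
    leaving the original untouched."""
    redacted = copy.deepcopy(captured_request)
    headers = redacted.get("headers", {})
    if "Authorization" in headers:
        headers["Authorization"] = "<REDACTED>"
    return redacted
```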