Create a chat completion

Generates a response from a model given a prompt or a series of messages.

<Note>To use our EU server for LLM Gateway, replace `llm-gateway.assemblyai.com` with `llm-gateway.eu.assemblyai.com`.</Note>

Authentication

`Authorization` string
API key authentication via header.

Request

Request body for creating a chat completion.
`model` string Required

The ID of the model to use for this request. See the LLM Gateway Overview for available models.

`messages` list of objects Optional
A list of messages comprising the conversation so far.
`prompt` string Optional
A simple string prompt. The API will automatically convert this into a user message.
`max_tokens` integer Optional >=1 Defaults to 1000
The maximum number of tokens to generate in the completion.
`temperature` double Optional
Controls randomness. Lower values produce more deterministic results.
`stream` boolean Optional Defaults to false

When true, responses are streamed as server-sent events (SSE). Supported on OpenAI models only.

`tools` list of objects Optional
A list of tools the model may call.
`tool_choice` enum or object Optional

Controls which (if any) function is called by the model.

`response_format` object Optional

Specifies the format of the model’s response. Use this to constrain the model to output valid JSON matching a schema. Supported by OpenAI (GPT-4.1, GPT-5.x), Gemini, and Claude models. Not supported by gpt-oss models.
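A request body combining the parameters above can be sketched as follows. This is a minimal sketch: the `/v1/chat/completions` path and the `<MODEL_ID>` placeholder are assumptions, not taken from this page; consult the LLM Gateway Overview for real model IDs.

```python
import json

# Base host from this page; for the EU server, swap in llm-gateway.eu.assemblyai.com.
BASE_URL = "https://llm-gateway.assemblyai.com"

headers = {
    "Authorization": "<YOUR_API_KEY>",  # API key authentication via header
    "Content-Type": "application/json",
}

payload = {
    "model": "<MODEL_ID>",  # required; see LLM Gateway Overview (placeholder here)
    "messages": [           # optional: the conversation so far
        {"role": "user", "content": "Summarize this transcript in one sentence."}
    ],
    "max_tokens": 1000,     # optional, >= 1, defaults to 1000
    "temperature": 0.2,     # optional: lower values are more deterministic
    "stream": False,        # optional: SSE streaming, OpenAI models only
}

body = json.dumps(payload)
# To send, POST the body to the completions endpoint (path is an assumption), e.g.:
# requests.post(f"{BASE_URL}/v1/chat/completions", headers=headers, data=body)
```

Note that `prompt` and `messages` are alternatives: a plain `prompt` string is converted into a single user message for you.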

Response

Successful response containing the model's choices.
`request_id` string or null format: "uuid"
`choices` list of objects or null
`request` object or null

A copy of the original request, excluding prompt and messages.

`usage` object or null
`http_status_code` integer or null
The HTTP status code of the response.
`response_time` integer or null
The response time in nanoseconds.
`llm_status_code` integer or null
The status code from the LLM provider.
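Since every response field is nullable, guard before indexing into it. The sketch below parses a hypothetical response body shaped to match the fields documented above; the inner structure of each choice (an OpenAI-style `message` object) and the `usage` fields are assumptions, not taken from this page.

```python
import json

# Hypothetical response body for illustration; field names follow the
# documented schema, but the choice/usage internals are assumed.
sample = json.dumps({
    "request_id": "00000000-0000-0000-0000-000000000000",
    "choices": [
        {"message": {"role": "assistant", "content": "Hello!"}}
    ],
    "request": {"model": "<MODEL_ID>", "max_tokens": 1000},  # prompt/messages excluded
    "usage": {"input_tokens": 12, "output_tokens": 3},
    "http_status_code": 200,
    "response_time": 512000000,  # nanoseconds
    "llm_status_code": 200,
})

data = json.loads(sample)

# choices may be null, so fall back to an empty list before indexing.
choices = data.get("choices") or []
text = choices[0]["message"]["content"] if choices else None

# response_time is reported in nanoseconds; convert to milliseconds.
latency_ms = (data.get("response_time") or 0) / 1_000_000
```

Checking `llm_status_code` separately from `http_status_code` lets you distinguish gateway-level failures from errors returned by the underlying LLM provider.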