Multi-turn Conversations

Overview

Multi-turn conversations allow you to maintain context across multiple exchanges by including conversation history in your API requests. This enables the model to understand and reference previous messages, creating natural, coherent dialogues.

Why use multi-turn conversations?

With conversation history, you can:

  • Ask follow-up questions - Ask “What’s the population?” and the model knows you’re referring to Paris from the previous message
  • Build on previous responses - Request clarifications, expansions, or corrections without repeating context
  • Create interactive experiences - Build chatbots, assistants, and conversational interfaces that feel natural

How it works

Each API request includes an array of previous messages. The model uses this history to understand context and maintain coherence across the conversation:

# First exchange
messages = [
    {"role": "user", "content": "What is the capital of France?"}
]
# Response: "The capital of France is Paris."

# Second exchange - model remembers Paris
messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What's the population?"}
]
# Response: "As of the latest estimates, the population of Paris is approximately 2.2 million..."

Note: You’re responsible for managing conversation history. Each request must include all relevant previous messages - the API doesn’t store history between requests.
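Because the API is stateless, callers typically keep the message list in memory and bound its growth so long conversations don't exceed the context window. A minimal sketch of such a helper (hypothetical, not part of any SDK; the 20-message cap is an arbitrary choice):

```python
def append_turn(history, role, content, max_messages=20):
    """Append a message, then drop the oldest turns beyond max_messages.

    A leading system message, if present, is always preserved so the
    model keeps its instructions even after old turns are trimmed.
    """
    history.append({"role": role, "content": content})
    overflow = len(history) - max_messages
    if overflow > 0:
        keep_from = 1 if history[0]["role"] == "system" else 0
        del history[keep_from:keep_from + overflow]
    return history
```

With this pattern, every user prompt and every assistant reply passes through `append_turn` before the next request is sent.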

Getting started

Maintain context by including conversation history:

import requests

headers = {
    "Authorization": "<YOUR_API_KEY>"
}

conversation_history = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What's the population?"}
]

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json={
        "model": "claude-sonnet-4-5-20250929",
        "messages": conversation_history,
        "max_tokens": 1000
    }
)

result = response.json()
agent_response = result["choices"][0]["message"]["content"]
print(agent_response)

# Append the assistant's response to conversation history
conversation_history.append({"role": "assistant", "content": agent_response})

Message types

When building conversation history, you can use the following message types:

  • user - Messages from the user
  • assistant - Messages from the AI model
  • system - System instructions or context

Structure your conversation history with these message types to track the complete interaction flow between the user and model.
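For example, a history that uses all three roles might look like this (the content values are illustrative):

```python
# System instructions come first, followed by alternating user/assistant turns.
conversation = [
    {"role": "system", "content": "You are a concise geography tutor."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What's the population?"},
]

roles = [message["role"] for message in conversation]
```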

API reference

Request

The LLM Gateway accepts POST requests to https://llm-gateway.assemblyai.com/v1/chat/completions with the following parameters:

curl -X POST "https://llm-gateway.assemblyai.com/v1/chat/completions" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"},
      {"role": "assistant", "content": "The capital of France is Paris."},
      {"role": "user", "content": "What'\''s the population?"}
    ],
    "max_tokens": 1000
  }'

Request parameters

| Key | Type | Required? | Description |
| --- | --- | --- | --- |
| model | string | Yes | The model to use for completion. See the Available models section for supported values. |
| messages | array | Yes | An array of message objects representing the conversation history. |
| max_tokens | number | No | The maximum number of tokens to generate. Range: [1, context_length). |
| temperature | number | No | Controls randomness in the output. Higher values make output more random. Range: [0, 2]. |
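As an illustration, a request body that sets both optional parameters might be built like this (the prompt and temperature value are arbitrary examples):

```python
payload = {
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
        {"role": "user", "content": "Name three French cities."}
    ],
    "max_tokens": 500,   # must be at least 1 and below the model's context length
    "temperature": 0.7,  # 0 = most deterministic, 2 = most random
}
```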

Message object

| Key | Type | Required? | Description |
| --- | --- | --- | --- |
| role | string | Yes | The role of the message sender. Valid values: "user", "assistant", "system", or "tool". |
| content | string or array | Yes | The message content. Can be a string or an array of content parts for the "user" role. |
| name | string | No | An optional name for the message sender. For non-OpenAI models, this will be prepended as {name}: {content}. |

Content part object

| Key | Type | Required? | Description |
| --- | --- | --- | --- |
| type | string | Yes | The type of content. Currently only "text" is supported. |
| text | string | Yes | The text content. |
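A user message can therefore carry either a plain string or a list of typed parts. A sketch of both forms (the text values are illustrative):

```python
# A plain-string user message...
simple = {"role": "user", "content": "Summarize the paragraph below."}

# ...and the equivalent array-of-content-parts form.
structured = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize the paragraph below."},
        {"type": "text", "text": "Paris is the capital and largest city of France."},
    ],
}
```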

Response

The API returns a JSON response with the model’s completion:

{
  "request_id": "abc123",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "As of the latest estimates, the population of Paris is approximately 2.2 million people within the city proper, and around 12 million in the greater metropolitan area."
      },
      "finish_reason": "stop"
    }
  ],
  "request": {
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 1000
  },
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 35,
    "total_tokens": 80
  }
}

Response fields

| Key | Type | Description |
| --- | --- | --- |
| request_id | string | A unique identifier for the request. |
| choices | array | An array of completion choices. Typically contains one choice. |
| choices[i].message | object | The message object containing the model's response. |
| choices[i].message.role | string | The role of the message, typically "assistant". |
| choices[i].message.content | string | The text content of the model's response. |
| choices[i].finish_reason | string | The reason the model stopped generating. Common values: "stop", "length". |
| request | object | Echo of the request parameters (excluding messages). |
| usage | object | Token usage statistics for the request. |
| usage.prompt_tokens | number | Number of tokens in the prompt. |
| usage.completion_tokens | number | Number of tokens in the completion. |
| usage.total_tokens | number | Total tokens used (prompt + completion). |
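A small parsing sketch against a response shaped like the one above; checking finish_reason for "length" is one way to detect a completion cut off by max_tokens (the result dict here is a hand-written sample, not live output):

```python
result = {
    "request_id": "abc123",
    "choices": [
        {
            "message": {"role": "assistant", "content": "Approximately 2.2 million."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 45, "completion_tokens": 35, "total_tokens": 80},
}

choice = result["choices"][0]
text = choice["message"]["content"]
truncated = choice["finish_reason"] == "length"  # hit max_tokens before finishing
tokens_used = result["usage"]["total_tokens"]
```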

Error response

If an error occurs, the API returns an error response:

{
  "error": {
    "code": 400,
    "message": "Invalid request: missing required field 'model'",
    "metadata": {}
  }
}
| Key | Type | Description |
| --- | --- | --- |
| error | object | Container for error information. |
| error.code | number | HTTP status code for the error. |
| error.message | string | A human-readable description of the error. |
| error.metadata | object | Optional additional error context. |

Common error codes

| Code | Description |
| --- | --- |
| 400 | Bad Request - Invalid request parameters |
| 401 | Unauthorized - Invalid or missing API key |
| 403 | Forbidden - Insufficient permissions |
| 404 | Not Found - Invalid endpoint or model |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error - Server-side error |
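A common client-side pattern is to retry only the transient failures (rate limits and server errors) and surface the rest immediately. A minimal sketch, where the retryable set and exponential backoff are assumptions rather than gateway guidance:

```python
RETRYABLE = {429, 500}  # rate limits and server errors are usually transient

def error_code(body):
    """Extract the numeric code from an error response body, or None."""
    return body.get("error", {}).get("code")

def should_retry(body, attempt, max_attempts=3):
    """Retry only transient errors, and only up to max_attempts tries."""
    return attempt < max_attempts and error_code(body) in RETRYABLE

def backoff_seconds(attempt):
    """Exponential backoff: 1s, 2s, 4s, ..."""
    return 2 ** attempt
```

A caller would sleep for `backoff_seconds(attempt)` between tries while `should_retry` returns True.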