Translation
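Supported languages
en, en_au, en_uk, en_us, es, fr, de, it, pt, nl, hi, ja, zh, fi, ko, pl, ru, tr, uk, vi, af, sq, am, ar, hy, as, az, eu, be, bn, bs, bg, ca, hr, cs, da, et, gl, ka, el, gu, ht, ha, haw, he, hu, is, id, jw, kn, kk, lo, la, lv, lt, lb, mk, mg, ms, ml, mt, mi, mr, mn, ne, no, pa, ps, fa, ro, sr, sn, sd, si, sk, sl, so, su, sw, sv, tl, tg, ta, te, ur, uz, cy, yi, yo
Supported models
slam-1, universal
Supported regions
US only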
Overview
The Translation feature automatically converts your transcribed audio content from one language to another, enabling you to reach global audiences without manual translation work. You can translate transcripts into over 100 languages with a single API request.
Key capabilities:
- Translate to multiple target languages simultaneously
- Choose between formal and informal translation styles
- Translate during transcription or add translations to existing transcripts
- Get full-text translations that preserve the original meaning and context
- Get per-speaker translated utterances when using Speaker Labels
Common use cases:
- Creating multilingual subtitles for video content
- Translating customer support calls for international teams
- Localizing podcast episodes for different markets
- Making educational content accessible in multiple languages
- Generating multilingual meeting summaries
Quickstart
There are two ways to use Translation:
- Transcribe and translate in one request - Best when you’re starting a new transcription and want to automatically translate the transcript text as part of that process
- Transcribe and translate in separate requests - Best when you already have a completed transcript that you would like to translate, or for more complex workflows where you want to keep the transcription and translation tasks separate
Method 1: Transcribe and translate in one request
This method is ideal when you’re starting fresh and want both transcription and translation in a single workflow.
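A minimal sketch of the single-request flow, using only the Python standard library. The `speech_understanding` parameter is named on this page; the endpoint URL, the `request`/`translation` nesting, and the `target_languages` and `formal` field names are assumptions made for illustration and should be checked against the API reference.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the API reference before use.
TRANSCRIPT_URL = "https://api.assemblyai.com/v2/transcript"


def build_translation_request(audio_url, target_languages, formal=True):
    """Build a transcription request body that also asks for translation."""
    if not target_languages:
        raise ValueError("at least one target language is required")
    return {
        "audio_url": audio_url,
        # `speech_understanding` is named on this page; the nested keys
        # below are assumed names, not confirmed by it.
        "speech_understanding": {
            "request": {
                "translation": {
                    "target_languages": list(target_languages),
                    "formal": formal,
                }
            }
        },
    }


def submit(body, api_key, url=TRANSCRIPT_URL):
    """POST the body as JSON and return the decoded response (needs network)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Building the body separately from sending it keeps the request shape easy to inspect and adjust before any network call is made.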
Method 2: Transcribe and translate in separate requests
This method is useful when you already have a completed transcript that you would like to translate, or for more complex workflows where you want to keep the transcription and translation tasks separate.
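The second step of this flow can be sketched as a single call that references an existing transcript by ID. The URL, the `transcript_id` field, and the nested `target_languages` and `formal` keys are assumptions, and `post` is a hypothetical caller-supplied function rather than part of any SDK.

```python
# Assumed URL for the Speech Understanding API; this page does not give one.
UNDERSTANDING_URL = "https://llm-gateway.assemblyai.com/v1/understanding"


def translate_existing_transcript(transcript_id, target_languages, post, formal=True):
    """Ask for translations of an already completed transcript.

    `post` is a hypothetical callable (url, json_body) -> dict supplied by
    the caller, for example a thin wrapper around an HTTP client.
    """
    body = {
        # Assumed field name for referencing the finished transcript.
        "transcript_id": transcript_id,
        "speech_understanding": {
            "request": {
                "translation": {
                    "target_languages": list(target_languages),
                    "formal": formal,
                }
            }
        },
    }
    return post(UNDERSTANDING_URL, body)
```

Passing the transport in as an argument keeps the sketch independent of any particular HTTP library and makes it easy to exercise with a stub.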
Expected output:
Output format
The Translation API returns translations in the translated_texts key of the response. This key contains an object where each property is a language code corresponding to one of your target languages, and the value is the full translated text.
Example response structure:
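As an illustration of the shape described above, the following sketch reads translations out of a response dictionary. The `translated_texts` key comes from this page; the sample values are placeholders rather than real API output, and the helper names are hypothetical.

```python
# Illustrative response only: the strings are placeholders, not API output.
sample_response = {
    "id": "transcript-id-placeholder",
    "text": "Hello, world.",
    "translated_texts": {
        "es": "Hola, mundo.",
        "de": "Hallo, Welt.",
    },
}


def get_translation(response, language_code):
    """Return the full translated text for one target language, or None."""
    translations = response.get("translated_texts") or {}
    return translations.get(language_code)


def available_languages(response):
    """Return the language codes present in the response, sorted."""
    return sorted(response.get("translated_texts") or {})
```

Treating a missing or null `translated_texts` as an empty mapping lets the same helpers run against transcripts that were created without translation.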
Translation with speaker labels
When you use Translation with Speaker Labels, you can get translated text for each individual utterance by setting match_original_utterance to true. This is useful for creating speaker-specific subtitles or analyzing conversations in multiple languages while preserving speaker attribution.
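One way to turn this on is to extend an existing request body. The sketch below assumes that `speaker_labels` is a top-level request field and that `match_original_utterance` sits inside a `translation` object nested under `speech_understanding.request`; that placement is an assumption, not something this page states.

```python
import copy


def enable_per_utterance_translation(body):
    """Return a copy of a request body with per-utterance translation enabled."""
    updated = copy.deepcopy(body)  # leave the caller's dictionary untouched
    updated["speaker_labels"] = True
    # Create the assumed nesting if the body does not have it yet.
    translation = (
        updated.setdefault("speech_understanding", {})
        .setdefault("request", {})
        .setdefault("translation", {})
    )
    translation["match_original_utterance"] = True
    return updated
```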
Example response:
Each utterance in the utterances array includes a translated_texts object with the translation for that specific speaker’s utterance:
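A small sketch of reading those per-utterance translations, for instance to build speaker-attributed subtitle lines. The `utterances` and `translated_texts` keys come from this page; the `speaker` and `text` field names and all sample values are assumptions used for illustration.

```python
# Illustrative data only: speakers and strings are placeholders.
sample_response = {
    "utterances": [
        {"speaker": "A", "text": "Good morning.",
         "translated_texts": {"es": "Buenos días."}},
        {"speaker": "B", "text": "Thank you.",
         "translated_texts": {"es": "Gracias."}},
    ]
}


def translated_utterances(response, language_code):
    """Yield (speaker, original_text, translated_text) for each utterance."""
    for utterance in response.get("utterances") or []:
        translations = utterance.get("translated_texts") or {}
        yield (
            utterance.get("speaker"),
            utterance.get("text"),
            translations.get(language_code),
        )


def as_subtitle_lines(response, language_code):
    """Format translated utterances as 'Speaker X: text', skipping gaps."""
    return [
        f"Speaker {speaker}: {translated}"
        for speaker, _, translated in translated_utterances(response, language_code)
        if translated is not None
    ]
```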
API reference
Request
Method 1: Transcribe and translate in one request
When creating a new transcription, include the speech_understanding parameter directly in your transcription request:
Method 2: Add translation to existing transcripts
For existing transcripts, retrieve the completed transcript and send it to the Speech Understanding API:
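Retrieving a completed transcript usually means polling until processing has finished. The sketch below assumes the transcript object carries a `status` field with the values `completed` and `error`, plus an `error` message on failure; none of these are stated on this page. `get` is a hypothetical caller-supplied function that fetches a transcript by ID.

```python
import time


def wait_for_transcript(transcript_id, get, poll_seconds=3.0, max_polls=200,
                        sleep=time.sleep):
    """Poll until the transcript is complete, then return it.

    `get` is a hypothetical callable transcript_id -> dict. The status
    values checked here are assumptions about the transcript object.
    """
    for _ in range(max_polls):
        transcript = get(transcript_id)
        status = transcript.get("status")
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(transcript.get("error", "transcription failed"))
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"transcript {transcript_id} not ready after {max_polls} polls")
```

The returned transcript's ID can then be passed on to the translation request. Bounding the number of polls avoids waiting forever on a job that never leaves the queue.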
Response
The Translation API returns your original transcript response with an additional translated_texts key containing the translations. When match_original_utterance is enabled with speaker_labels, each utterance in the utterances array will also include its own translated_texts key.
Key differences from standard transcription
All other fields from the original transcript (text, words, utterances, confidence, etc.) remain unchanged.