Translation
Supported languages
en
en_au
en_uk
en_us
es
fr
de
it
pt
nl
hi
ja
zh
fi
ko
pl
ru
tr
uk
vi
af
sq
am
ar
hy
as
az
eu
be
bn
bs
bg
ca
hr
cs
da
et
gl
ka
el
gu
ht
ha
haw
he
hu
is
id
jw
kn
kk
lo
la
lv
lt
lb
mk
mg
ms
ml
mt
mi
mr
mn
ne
no
pa
ps
fa
ro
sr
sn
sd
si
sk
sl
so
su
sw
sv
tl
tg
ta
te
ur
uz
cy
yi
yo
Supported models
slam-1
universal
Supported regions
US only
Overview
The Translation feature automatically converts your transcribed audio content from one language to another, enabling you to reach global audiences without manual translation work. You can translate transcripts into over 100 languages with a single API request.
Key capabilities:
- Translate to multiple target languages simultaneously
- Choose between formal and informal translation styles
- Translate during transcription or add translations to existing transcripts
- Get full-text translations that preserve the original meaning and context
- Get per-speaker translated utterances when using Speaker Labels
Common use cases:
- Creating multilingual subtitles for video content
- Translating customer support calls for international teams
- Localizing podcast episodes for different markets
- Making educational content accessible in multiple languages
- Generating multilingual meeting summaries
Quickstart
There are two ways to use Translation:
- Transcribe and translate in one request - Best when you’re starting a new transcription and want to automatically translate the transcript text as part of that process
- Transcribe and translate in separate requests - Best when you already have a completed transcript that you want to translate, or when a more complex workflow calls for separating the transcription and translation steps
Method 1: Transcribe and translate in one request
This method is ideal when you’re starting fresh and want both transcription and translation in a single workflow.
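As a minimal sketch of this method, the helper below builds a transcription request body with the translation settings nested under the speech_understanding parameter. The nesting under request and translation, and the target_languages and formal field names, are assumptions inferred from the capabilities listed in the overview; build_transcription_request is a hypothetical helper, not part of any SDK. Check the field names against the current API reference before relying on them.

```python
import json


def build_transcription_request(audio_url, target_languages, formal=True):
    """Build a JSON body that transcribes audio and translates it in one request."""
    if not target_languages:
        raise ValueError("at least one target language code is required")
    return {
        "audio_url": audio_url,
        # The field names below speech_understanding are assumptions.
        "speech_understanding": {
            "request": {
                "translation": {
                    "target_languages": list(target_languages),
                    "formal": formal,
                }
            }
        },
    }


body = build_transcription_request("https://example.com/audio.mp3", ["es", "de"])
print(json.dumps(body, indent=2))
```

You would send this body as JSON when creating the transcript, with your API key in the request headers.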
Method 2: Transcribe and translate in separate requests
This method is useful when you already have a completed transcript that you want to translate, or when a more complex workflow calls for separating the transcription and translation steps.
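The second request could be prepared roughly as follows, using only the Python standard library. The endpoint URL, the authorization header, the transcript_id field, and the nesting under speech_understanding are all assumptions to verify against the API reference; build_translation_request is a hypothetical helper. The sketch prepares the request without sending it, so it runs without network access.

```python
import json
import urllib.request

# Assumed Speech Understanding endpoint; verify against the API reference.
UNDERSTANDING_URL = "https://llm-gateway.assemblyai.com/v1/understanding"


def build_translation_request(api_key, transcript_id, target_languages):
    """Prepare (but do not send) a POST request that translates an existing transcript."""
    body = {
        "transcript_id": transcript_id,
        # The field names below speech_understanding are assumptions.
        "speech_understanding": {
            "request": {"translation": {"target_languages": list(target_languages)}}
        },
    }
    return urllib.request.Request(
        UNDERSTANDING_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


request = build_translation_request("<YOUR_API_KEY>", "<TRANSCRIPT_ID>", ["es", "de"])
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response body.
```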
Output format
The Translation API returns translations in the translated_texts key of the response. This key contains an object in which each property is a language code for one of your target languages, and each value is the full translated text for that language.
Example response structure:
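The shape described above can be sketched in Python. The response dictionary here is hypothetical and trimmed to a few fields, its text is illustrative rather than real API output, and get_translation is an illustrative helper rather than part of any SDK.

```python
# Hypothetical response, trimmed to the fields relevant here.
response = {
    "id": "example-transcript-id",
    "text": "Hello, how are you?",
    "translated_texts": {
        "es": "Hola, ¿cómo estás?",
        "de": "Hallo, wie geht es dir?",
    },
}


def get_translation(response, language_code):
    """Return the full translated text for one target language, or None if absent."""
    return response.get("translated_texts", {}).get(language_code)


# Each key is a target language code; each value is the full translated text.
for code, text in response["translated_texts"].items():
    print(f"{code}: {text}")
```

Returning None for a missing language code lets callers distinguish "not requested" from an empty translation.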
Translated utterances with Speaker Labels
When you enable speaker_labels and set match_original_utterance to true, each utterance in the utterances array includes a translated_texts key containing the translations of that specific utterance. This is useful for creating speaker-specific subtitles or analyzing conversations in multiple languages.
Example utterance with translations:
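To illustrate how per-utterance translations might be consumed, the sketch below walks a list of utterances and extracts the translation for one language, for example to build speaker-specific subtitle lines. The utterance data is hypothetical, and the speaker, start, and end fields (with times in milliseconds) are assumptions about the utterance shape.

```python
# Hypothetical utterances from a transcript with speaker labels enabled.
utterances = [
    {
        "speaker": "A",
        "start": 0,
        "end": 1800,
        "text": "Good morning.",
        "translated_texts": {"es": "Buenos días."},
    },
    {
        "speaker": "B",
        "start": 2000,
        "end": 4100,
        "text": "Thank you.",
        "translated_texts": {"es": "Gracias."},
    },
]


def translated_lines(utterances, language_code):
    """Yield (speaker, start, end, translated_text) for each utterance
    that has a translation in the given language."""
    for utterance in utterances:
        text = utterance.get("translated_texts", {}).get(language_code)
        if text is not None:
            yield utterance["speaker"], utterance["start"], utterance["end"], text


for speaker, start, end, text in translated_lines(utterances, "es"):
    print(f"[{start}-{end} ms] Speaker {speaker}: {text}")
```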
Each utterance’s translated_texts object follows the same structure as the top-level translated_texts, with language codes as keys and translated text as values.
API reference
Request
Method 1: Transcribe and translate in one request
When creating a new transcription, include the speech_understanding parameter directly in your transcription request:
Method 2: Add translation to existing transcripts
For existing transcripts, retrieve the completed transcript and send it to the Speech Understanding API:
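One way to handle the "retrieve the completed transcript" step is to poll until the transcript is ready before sending it for translation. The sketch below assumes the transcript carries a status field whose values include completed and error; fetch_transcript is a hypothetical callable that you would supply, for example a function that fetches the transcript by ID. A simulated fetcher stands in for the network call so the example runs on its own.

```python
import time


def wait_until_completed(fetch_transcript, poll_interval=3.0, max_attempts=100):
    """Poll until the transcript reports completion, then return it.

    fetch_transcript is any zero-argument callable that returns the
    transcript as a dict.
    """
    for _ in range(max_attempts):
        transcript = fetch_transcript()
        status = transcript.get("status")  # assumed field name and values
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"transcription failed: {transcript.get('error')}")
        time.sleep(poll_interval)
    raise TimeoutError("transcript did not complete in time")


# Simulated fetcher so the sketch runs without network access.
_responses = iter([{"status": "processing"}, {"status": "completed", "id": "abc123"}])
transcript = wait_until_completed(lambda: next(_responses), poll_interval=0)
print(transcript["id"])
```

Once the transcript is complete, its ID can be passed along in the translation request.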
Response
The Translation API returns your original transcript response with an additional translated_texts key containing the translations. When match_original_utterance is enabled together with speaker_labels, each utterance in the utterances array also includes its own translated_texts key.
Key differences from standard transcription
All other fields from the original transcript (text, words, utterances, confidence, etc.) remain unchanged.