Translation

Language | Code
Global English | en
Australian English | en_au
British English | en_uk
US English | en_us
Spanish | es
French | fr
German | de
Italian | it
Portuguese | pt
Dutch | nl
Hindi | hi
Japanese | ja
Chinese | zh
Finnish | fi
Korean | ko
Polish | pl
Russian | ru
Turkish | tr
Ukrainian | uk
Vietnamese | vi
Afrikaans | af
Albanian | sq
Amharic | am
Arabic | ar
Armenian | hy
Assamese | as
Azerbaijani | az
Basque | eu
Belarusian | be
Bengali | bn
Bosnian | bs
Bulgarian | bg
Catalan | ca
Croatian | hr
Czech | cs
Danish | da
Estonian | et
Galician | gl
Georgian | ka
Greek | el
Gujarati | gu
Haitian | ht
Hausa | ha
Hawaiian | haw
Hebrew | he
Hungarian | hu
Icelandic | is
Indonesian | id
Javanese | jw
Kannada | kn
Kazakh | kk
Lao | lo
Latin | la
Latvian | lv
Lithuanian | lt
Luxembourgish | lb
Macedonian | mk
Malagasy | mg
Malay | ms
Malayalam | ml
Maltese | mt
Maori | mi
Marathi | mr
Mongolian | mn
Nepali | ne
Norwegian | no
Panjabi | pa
Pashto | ps
Persian | fa
Romanian | ro
Serbian | sr
Shona | sn
Sindhi | sd
Sinhala | si
Slovak | sk
Slovenian | sl
Somali | so
Sundanese | su
Swahili | sw
Swedish | sv
Tagalog | tl
Tajik | tg
Tamil | ta
Telugu | te
Urdu | ur
Uzbek | uz
Welsh | cy
Yiddish | yi
Yoruba | yo

Model | Code
Slam 1 | slam-1
Universal | universal

US only

Overview

The Translation feature automatically converts your transcribed audio content from one language to another, enabling you to reach global audiences without manual translation work. You can translate transcripts into over 100 languages with a single API request.

Key capabilities:

  • Translate to multiple target languages simultaneously
  • Choose between formal and informal translation styles
  • Translate during transcription or add translations to existing transcripts
  • Get full-text translations that preserve the original meaning and context
  • Get per-speaker translated utterances when using Speaker Labels

Common use cases:

  • Creating multilingual subtitles for video content
  • Translating customer support calls for international teams
  • Localizing podcast episodes for different markets
  • Making educational content accessible in multiple languages
  • Generating multilingual meeting summaries

Quickstart

There are two ways to use Translation:

  1. Transcribe and translate in one request - Best when you’re starting a new transcription and want to automatically translate the transcript text as part of that process
  2. Transcribe and translate in separate requests - Best when you already have a completed transcript that you want to translate, or for more complex workflows where you want to keep the transcription and translation steps separate

Method 1: Transcribe and translate in one request

This method is ideal when you’re starting fresh and want both transcription and translation in a single workflow.

import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "YOUR_API_KEY"
}

# Need to transcribe a local file? Learn more here: https://www.assemblyai.com/docs/getting-started/transcribe-an-audio-file
audio_url = "https://assembly.ai/wildfires.mp3"

# Configure transcription with translation
data = {
    "audio_url": audio_url,
    "speaker_labels": True,  # Enable speaker labels
    "speech_understanding": {
        "request": {
            "translation": {
                "target_languages": ["es", "de"],  # Translate to Spanish and German
                "formal": True,  # Use formal language style
                "match_original_utterance": True  # Get translated utterances
            }
        }
    }
}

# Submit transcription request
response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

# Poll transcription results
while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()

    if transcript["status"] == "completed":
        break

    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")

    else:
        time.sleep(3)

# Access and display results
print("\n--- Original Transcript ---")
print(transcript['text'][:200] + "...\n")

print("--- Translations ---")
for language_code, translated_text in transcript['translated_texts'].items():
    print(f"{language_code.upper()}:")
    print(translated_text[:200] + "...\n")
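
The polling loop above runs until the transcript either completes or fails. A minimal sketch of the same loop with an upper bound on the waiting time is shown below. The helper name `wait_for_transcript`, its `fetch` callback, and the default timeout are assumptions made for illustration and are not part of the AssemblyAI API.

```python
import time
from typing import Callable


def wait_for_transcript(
    fetch: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch` until the transcript completes, fails, or `timeout` seconds pass.

    `fetch` is a hypothetical callable that returns the transcript JSON as a dict,
    for example: lambda: requests.get(polling_endpoint, headers=headers).json()
    """
    deadline = time.monotonic() + timeout
    while True:
        transcript = fetch()
        status = transcript.get("status")
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript.get('error')}")
        # Any other status means the job is still running; give up once the deadline passes
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout} seconds")
        sleep(interval)
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP client, so it can be exercised with canned responses before pointing it at the real endpoint.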

Method 2: Transcribe and translate in separate requests

This method is useful when you already have a completed transcript that you want to translate, or for more complex workflows where you want to keep the transcription and translation steps separate.

import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "YOUR_API_KEY"
}

# Need to transcribe a local file? Learn more here: https://www.assemblyai.com/docs/getting-started/transcribe-an-audio-file
audio_url = "https://assembly.ai/wildfires.mp3"

# Submit transcription request (without translation)
data = {
    "audio_url": audio_url,
    "speaker_labels": True
}

# Transcribe file
response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

# Poll for transcription completion
while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()

    if transcript["status"] == "completed":
        print("Transcription completed!")
        break

    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")

    else:
        time.sleep(3)

# Add translation configuration to the completed transcript
understanding_body = {
    "transcript_id": transcript_id,
    "speech_understanding": {
        "request": {
            "translation": {
                "target_languages": ["es", "de"],  # Translate to Spanish and German
                "formal": True,  # Use formal language style
                "match_original_utterance": True  # Get translated utterances (if speaker_labels was enabled)
            }
        }
    }
}

# Send to Speech Understanding API for translation
result = requests.post(
    "https://llm-gateway.assemblyai.com/v1/understanding",
    headers=headers,
    json=understanding_body
).json()

# Access and display results
print("\n--- Original Transcript ---")
print(transcript['text'][:200] + "...\n")

print("--- Translations ---")
for language_code, translated_text in result['translated_texts'].items():
    print(f"{language_code.upper()}:")
    print(translated_text[:200] + "...\n")

Expected output:

--- Original Transcript ---
Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US...
--- Translations ---
ES:
El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...
DE:
Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus...

Output format

The Translation API returns translations in the translated_texts key of the response. This key contains an object where each property is a language code corresponding to one of your target languages, and the value is the full translated text.

Example response structure:

{
  "id": "735d90b6-2e8b-4748-b75d-d02b78eb7811",
  "status": "completed",
  "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
  "translated_texts": {
    "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...",
    "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus..."
  },
  "speech_understanding": {
    "request": {
      "translation": {
        "formal": true,
        "target_languages": [
          "es",
          "de"
        ],
        "match_original_utterance": true
      }
    },
    "response": {
      "translation": {
        "status": "success"
      }
    }
  },
  "utterances": [
    {
      "speaker": "A",
      "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
      "confidence": 0.9815734,
      "start": 240,
      "end": 26560,
      "words": [
        {
          "text": "Smoke",
          "start": 240,
          "end": 640,
          "confidence": 0.90152997,
          "speaker": "A"
        },
        // ... more words
      ],
      "translated_texts": {
        "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...",
        "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus..."
      }
    },
    // ... more utterances
  ],
  ...
}
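
As a minimal sketch of reading this structure, the helper below looks up one language in `translated_texts` and fails with a descriptive message when that language is absent. The name `get_translation` is hypothetical and not part of any SDK; it works on a plain dict shaped like the response above.

```python
def get_translation(transcript: dict, language_code: str) -> str:
    """Return the full translated text for `language_code` from a transcript response dict."""
    translations = transcript.get("translated_texts") or {}
    if language_code not in translations:
        available = ", ".join(sorted(translations)) or "none"
        raise KeyError(f"No translation for '{language_code}' (available: {available})")
    return translations[language_code]


# Trimmed-down response that reuses the values from the example above
response = {
    "status": "completed",
    "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
    "translated_texts": {
        "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...",
        "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus...",
    },
}

print(get_translation(response, "es"))
```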

Translated utterances with Speaker Labels

When you enable speaker_labels and set match_original_utterance to true, each utterance in the utterances array includes a translated_texts key containing translations of that speaker’s utterance. This is useful for creating speaker-specific subtitles or analyzing conversations in multiple languages.

Example utterance with translations:

{
  "speaker": "A",
  "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
  "confidence": 0.9815734,
  "start": 240,
  "end": 26560,
  "words": [...],
  "translated_texts": {
    "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...",
    "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus..."
  }
}

Each utterance’s translated_texts object follows the same structure as the top-level translated_texts, with language codes as keys and translated text as values.
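
One way to use per-utterance translations is to build subtitles. The sketch below assumes that `start` and `end` are offsets in milliseconds, as the example values suggest, and emits one SRT cue per utterance in the chosen language. The helpers `format_timestamp` and `utterances_to_srt` are hypothetical names introduced for this example.

```python
def format_timestamp(ms: int) -> str:
    """Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def utterances_to_srt(utterances: list, language_code: str) -> str:
    """Build SRT text with one cue per utterance, prefixed with the speaker label."""
    cues = []
    for index, utterance in enumerate(utterances, start=1):
        # Fall back to the original text if this utterance has no translation
        translated = utterance.get("translated_texts", {}).get(language_code, utterance["text"])
        cues.append(
            f"{index}\n"
            f"{format_timestamp(utterance['start'])} --> {format_timestamp(utterance['end'])}\n"
            f"Speaker {utterance['speaker']}: {translated}\n"
        )
    return "\n".join(cues)


# Utterance copied from the example above (the words array is omitted)
utterances = [
    {
        "speaker": "A",
        "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
        "start": 240,
        "end": 26560,
        "translated_texts": {
            "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...",
        },
    }
]

print(utterances_to_srt(utterances, "es"))
```

A full utterance can run much longer than a comfortable subtitle cue, so a production pipeline would probably split long utterances into shorter cues; that step is left out here.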

API reference

Request

Method 1: Transcribe and translate in one request

When creating a new transcription, include the speech_understanding parameter directly in your transcription request:

curl -X POST \
  "https://api.assemblyai.com/v2/transcript" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://assembly.ai/wildfires.mp3",
    "speaker_labels": true,
    "speech_understanding": {
      "request": {
        "translation": {
          "target_languages": ["es", "de"],
          "formal": true,
          "match_original_utterance": true
        }
      }
    }
  }'

Method 2: Add translation to existing transcripts

For existing transcripts, retrieve the transcript to confirm that it has completed, then send its ID to the Speech Understanding API:

# Step 1: Get the completed transcript
transcript=$(curl -s -X GET \
  "https://api.assemblyai.com/v2/transcript/YOUR_TRANSCRIPT_ID" \
  -H "Authorization: YOUR_API_KEY")

# Step 2: Send the transcript ID and translation settings to the Speech Understanding API
curl -X POST \
  "https://llm-gateway.assemblyai.com/v1/understanding" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "transcript_id": "YOUR_TRANSCRIPT_ID",
    "speech_understanding": {
      "request": {
        "translation": {
          "target_languages": ["es", "de"],
          "formal": true,
          "match_original_utterance": true
        }
      }
    }
  }'

Key | Type | Required? | Description
speech_understanding | object | Yes | Container for speech understanding requests.
speech_understanding.request | object | Yes | The understanding request configuration.
speech_understanding.request.translation | object | Yes | Translation configuration.
translation.target_languages | array | Yes | Array of language codes to translate the transcript into. See the supported languages table for available language codes.
translation.formal | boolean | No | Whether to use formal language in translations. Defaults to false. When true, uses formal pronouns and grammatical forms.
translation.match_original_utterance | boolean | No | Whether to include translated texts for each utterance. Defaults to false. When true, returns a translated_texts key within each utterance in the utterances array. Requires speaker_labels to be set to true in the request.
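
The parameters in the table can be assembled programmatically. The following is a minimal sketch that builds the request body for each method and applies the documented defaults. The function names and the `per_utterance` argument are hypothetical; only the JSON keys come from the table.

```python
def build_translation_config(target_languages, formal=False, match_original_utterance=False) -> dict:
    """Return the `speech_understanding` object described in the table above."""
    if not target_languages:
        raise ValueError("target_languages must contain at least one language code")
    return {
        "request": {
            "translation": {
                "target_languages": list(target_languages),
                "formal": formal,
                "match_original_utterance": match_original_utterance,
            }
        }
    }


def build_transcript_request(audio_url, target_languages, formal=False, per_utterance=False) -> dict:
    """Body for POST /v2/transcript (Method 1)."""
    return {
        "audio_url": audio_url,
        # match_original_utterance requires speaker_labels, so both are switched on together
        "speaker_labels": per_utterance,
        "speech_understanding": build_translation_config(target_languages, formal, per_utterance),
    }


def build_understanding_request(transcript_id, target_languages, formal=False, per_utterance=False) -> dict:
    """Body for POST /v1/understanding (Method 2).

    Per-utterance translations assume the original transcript was created
    with speaker_labels enabled.
    """
    return {
        "transcript_id": transcript_id,
        "speech_understanding": build_translation_config(target_languages, formal, per_utterance),
    }


body = build_transcript_request(
    "https://assembly.ai/wildfires.mp3", ["es", "de"], formal=True, per_utterance=True
)
```

Tying `speaker_labels` to the per-utterance flag is a design choice of this sketch: it prevents a request that asks for translated utterances without the speaker labels they depend on.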

Response

The Translation API returns your original transcript response with an additional translated_texts key containing the translations. When match_original_utterance is enabled with speaker_labels, each utterance in the utterances array will also include its own translated_texts key.

{
  "id": "735d90b6-2e8b-4748-b75d-d02b78eb7811",
  "status": "completed",
  "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US...",
  "translated_texts": {
    "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire en todo Estados Unidos...",
    "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus..."
  },
  "speech_understanding": {
    "request": {
      "translation": {
        "formal": true,
        "target_languages": ["es", "de"]
      }
    },
    "response": {
      "translation": {
        "status": "success"
      }
    }
  }
}

Key | Type | Description
translated_texts | object | An object containing the translated texts, where each key is a language code and each value is the full translated transcript text.
utterances[].translated_texts | object | (When match_original_utterance is true) An object containing the translations for this specific utterance, with language codes as keys.
speech_understanding | object | Container for speech understanding request and response information.
speech_understanding.request | object | The original translation request configuration that was submitted.
speech_understanding.request.translation | object | The translation parameters that were used.
speech_understanding.response | object | The response information from the translation process.
speech_understanding.response.translation | object | Status information about the translation.
speech_understanding.response.translation.status | string | The status of the translation. Will be "success" when translation completes successfully.
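
Before using the translations, it can help to check the status field and confirm that every requested language is present. The sketch below does both on a response dict. The function `check_translation` is a hypothetical name, and treating any status other than "success" as a failure is an assumption, since no other status values are listed here.

```python
def check_translation(transcript: dict) -> list:
    """Return the requested language codes that are missing from `translated_texts`.

    Raises RuntimeError if the translation status is anything other than "success"
    (assumption: no other status value indicates a usable result).
    """
    understanding = transcript.get("speech_understanding", {})
    status = understanding.get("response", {}).get("translation", {}).get("status")
    if status != "success":
        raise RuntimeError(f"Translation did not succeed (status: {status!r})")

    requested = understanding.get("request", {}).get("translation", {}).get("target_languages", [])
    returned = transcript.get("translated_texts", {})
    return [code for code in requested if code not in returned]


# Response shaped like the example above, with the texts shortened
response = {
    "status": "completed",
    "translated_texts": {"es": "El humo...", "de": "Rauch..."},
    "speech_understanding": {
        "request": {"translation": {"formal": True, "target_languages": ["es", "de"]}},
        "response": {"translation": {"status": "success"}},
    },
}

missing = check_translation(response)
print("Missing languages:", missing)
```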

Key differences from standard transcription

Field | Standard Transcription | With Translation
translated_texts | Not present | Object with language codes as keys and translated texts as values
speech_understanding | Not present | Object containing the translation request and response details

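All other fields from the original transcript (text, words, utterances, confidence, etc.) remain unchanged, except that each utterance gains its own translated_texts key when match_original_utterance is enabled.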