The Translation feature automatically converts your transcribed audio content from one language to another, enabling you to reach global audiences without manual translation work. You can translate transcripts into over 100 languages with a single API request.
Key capabilities:
Translate to multiple target languages simultaneously
Choose between formal and informal translation styles
Translate during transcription or add translations to existing transcripts
Get full-text translations that preserve the original meaning and context
Get per-speaker translated utterances when using Speaker Labels
Common use cases:
Creating multilingual subtitles for video content
Translating customer support calls for international teams
Localizing podcast episodes for different markets
Making educational content accessible in multiple languages
Transcribe and translate in one request: best when you’re starting a new transcription and want the transcript text translated automatically as part of that process.
Transcribe and translate in separate requests: best when you already have text that you would like to translate, or for more complex workflows where you want to keep the transcription and translation tasks separate.
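For the single-request option, the request body might be assembled roughly as follows. This is a minimal sketch, not a confirmed request format: it assumes the translation settings are nested under speech_understanding.request.translation in the transcription request, mirroring the structure echoed back in the response examples on this page. The helper name build_transcribe_and_translate_body is hypothetical and not part of any SDK.

```python
def build_transcribe_and_translate_body(audio_url, target_languages, formal=False):
    """Build a transcription request body that also asks for translation.

    Assumption: the translation config is nested under
    speech_understanding.request.translation, as echoed in the response examples.
    """
    if not target_languages:
        raise ValueError("target_languages must contain at least one language code")
    return {
        "audio_url": audio_url,
        "language_detection": True,
        "speech_understanding": {
            "request": {
                "translation": {
                    "target_languages": list(target_languages),
                    "formal": formal,
                }
            }
        },
    }


# Hypothetical usage: build the body, then POST it as JSON to
# https://api.assemblyai.com/v2/transcript and poll until the status is "completed".
body = build_transcribe_and_translate_body(
    "https://assembly.ai/wildfires.mp3", ["es", "de"], formal=True
)
```

Keeping the body construction in one function makes it easy to reuse the same translation settings across many audio files.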
Method 2: Transcribe and translate in separate requests
This method is useful when you already have text that you would like to translate or for more complicated workflows where you want to separate the transcription and translation tasks.
Python
import requests
import time

base_url = "https://api.assemblyai.com"
headers = {
    "authorization": "<YOUR_API_KEY>"
}

# Need to transcribe a local file? Learn more here: https://www.assemblyai.com/docs/getting-started/transcribe-an-audio-file
audio_url = "https://assembly.ai/wildfires.mp3"

# Submit transcription request (without translation)
data = {
    "audio_url": audio_url,
    "speech_models": ["universal-3-pro", "universal-2"],
    "language_detection": True,
    "speaker_labels": True,
}

# Transcribe file
response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

# Poll for transcription completion
while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        print("Transcription completed!")
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

# Add translation configuration to the completed transcript
understanding_body = {
    "transcript_id": transcript_id,
    "speech_understanding": {
        "request": {
            "translation": {
                "target_languages": ["es", "de"],  # Translate to Spanish and German
                "formal": True  # Use formal language style
            }
        }
    }
}

# Send to Speech Understanding API for translation
result = requests.post(
    "https://llm-gateway.assemblyai.com/v1/understanding",
    headers=headers,
    json=understanding_body
).json()

# Access and display results
print("\n--- Original Transcript ---")
print(transcript['text'][:200] + "...\n")
print("--- Translations ---")
for language_code, translated_text in result['translated_texts'].items():
    print(f"{language_code.upper()}:")
    print(translated_text[:200] + "...\n")
JavaScript
const baseUrl = "https://api.assemblyai.com";
const headers = {
  authorization: "<YOUR_API_KEY>",
  "content-type": "application/json",
};

// Need to transcribe a local file? Learn more here: https://www.assemblyai.com/docs/getting-started/transcribe-an-audio-file
const audioUrl = "https://assembly.ai/wildfires.mp3";

// Helper function to sleep
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function main() {
  // Submit transcription request (without translation)
  const data = {
    audio_url: audioUrl,
    speech_models: ["universal-3-pro", "universal-2"],
    language_detection: true,
    speaker_labels: true,
  };

  // Transcribe file
  const response = await fetch(`${baseUrl}/v2/transcript`, {
    method: "POST",
    headers: headers,
    body: JSON.stringify(data),
  });
  const responseData = await response.json();
  const transcriptId = responseData.id;
  const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

  // Poll for transcription completion
  let transcript;
  while (true) {
    const pollResponse = await fetch(pollingEndpoint, {
      method: "GET",
      headers: headers,
    });
    transcript = await pollResponse.json();
    if (transcript.status === "completed") {
      console.log("Transcription completed!");
      break;
    } else if (transcript.status === "error") {
      throw new Error(`Transcription failed: ${transcript.error}`);
    } else {
      await sleep(3000);
    }
  }

  // Add translation configuration to the completed transcript
  const understandingBody = {
    transcript_id: transcriptId,
    speech_understanding: {
      request: {
        translation: {
          target_languages: ["es", "de"], // Translate to Spanish and German
          formal: true, // Use formal language style
        },
      },
    },
  };

  // Send to Speech Understanding API for translation
  const resultResponse = await fetch(
    "https://llm-gateway.assemblyai.com/v1/understanding",
    {
      method: "POST",
      headers: headers,
      body: JSON.stringify(understandingBody),
    }
  );
  const result = await resultResponse.json();

  // Access and display results
  console.log("\n--- Original Transcript ---");
  console.log(transcript.text.substring(0, 200) + "...\n");
  console.log("--- Translations ---");
  for (const [languageCode, translatedText] of Object.entries(
    result.translated_texts
  )) {
    console.log(`${languageCode.toUpperCase()}:`);
    console.log(translatedText.substring(0, 200) + "...\n");
  }
}

// Run the main function
main().catch((error) => {
  console.error("Error:", error);
});
Expected output:
--- Original Transcript ---
Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US...

--- Translations ---
ES:
El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...

DE:
Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus...
The Translation API returns translations in the translated_texts key of the response. This key contains an object where each property is a language code corresponding to one of your target languages, and the value is the full translated text.
Example response structure:
{
  "id": "735d90b6-2e8b-4748-b75d-d02b78eb7811",
  "status": "completed",
  "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
  "translated_texts": {
    "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...",
    "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus..."
  },
  "speech_understanding": {
    "request": {
      "translation": {
        "formal": true,
        "target_languages": ["es", "de"]
      }
    },
    "response": {
      "translation": {
        "status": "success"
      }
    }
  },
  "utterances": [
    {
      "speaker": "A",
      "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts...",
      "confidence": 0.9815734,
      "start": 240,
      "end": 26560,
      "words": [
        {
          "text": "Smoke",
          "start": 240,
          "end": 640,
          "confidence": 0.90152997,
          "speaker": "A"
        },
        // ... more words
      ],
      "translated_texts": {
        "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire...",
        "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus..."
      }
    },
    // ... more utterances
  ],
  ...
}
When you use Translation with Speaker Labels, you can get translated text for each individual utterance by setting match_original_utterance to true. This is useful for creating speaker-specific subtitles or analyzing conversations in multiple languages while preserving speaker attribution.
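As an illustration of how per-utterance translations could be consumed, the sketch below pairs each speaker's original text with its translation for one target language. The helper utterance_translations and the sample data are hypothetical; the field names (utterances, speaker, text, translated_texts) follow the example response structure on this page.

```python
def utterance_translations(response, language_code):
    """Return (speaker, original, translated) tuples for one target language."""
    rows = []
    for utterance in response.get("utterances") or []:
        translated = (utterance.get("translated_texts") or {}).get(language_code)
        if translated is not None:
            rows.append((utterance["speaker"], utterance["text"], translated))
    return rows


# Translation settings that request per-utterance output.
translation_config = {
    "target_languages": ["es", "de"],
    "formal": True,
    "match_original_utterance": True,
}

# Hand-written sample shaped like the documented response (not real API output).
sample_response = {
    "utterances": [
        {
            "speaker": "A",
            "text": "Smoke from hundreds of wildfires in Canada...",
            "translated_texts": {
                "es": "El humo de cientos de incendios forestales en Canadá...",
                "de": "Rauch von Hunderten von Waldbränden in Kanada...",
            },
        }
    ]
}

for speaker, original, translated in utterance_translations(sample_response, "es"):
    print(f"Speaker {speaker}: {translated}")
```

Utterances without a translation for the requested language are skipped rather than raising, which keeps the helper usable on responses where match_original_utterance was not enabled.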
For existing transcripts, retrieve the completed transcript and send it to the Speech Understanding API:
# Step 1: Get the completed transcript
transcript=$(curl -s -X GET \
  "https://api.assemblyai.com/v2/transcript/YOUR_TRANSCRIPT_ID" \
  -H "Authorization: YOUR_API_KEY")

# Step 2: Add translation and send to Speech Understanding API
curl -X POST \
  "https://llm-gateway.assemblyai.com/v1/understanding" \
  -H "Authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "transcript_id": "YOUR_TRANSCRIPT_ID",
    "speech_understanding": {
      "request": {
        "translation": {
          "target_languages": ["es", "de"],
          "formal": true
        }
      }
    }
  }'
| Key | Type | Required? | Description |
| --- | --- | --- | --- |
| speech_understanding | object | Yes | Container for speech understanding requests. |
| speech_understanding.request | object | Yes | The understanding request configuration. |
| speech_understanding.request.translation | object | Yes | Translation configuration. |
| translation.target_languages | array | Yes | Array of language codes to translate the transcript into. See the supported languages table for available language codes. |
| translation.formal | boolean | No | Whether to use formal language in translations. Defaults to false. When true, uses formal pronouns and grammatical forms. |
| translation.match_original_utterance | boolean | No | Whether to include translated texts for each utterance. Defaults to false. When true, returns a translated_texts key within each utterance in the utterances array. Requires speaker_labels to be set to true in the request. |
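The parameter rules above can be checked on the client before sending a request. The following is a minimal sketch: normalize_translation_config is a hypothetical helper, and treating an empty target_languages list as an error is an assumption (the table only states that the array is required).

```python
def normalize_translation_config(config):
    """Validate a translation config dict and fill in the documented defaults."""
    languages = config.get("target_languages")
    # Assumption: an empty list is treated the same as a missing value.
    if not isinstance(languages, list) or not languages:
        raise ValueError("target_languages is required and must be a non-empty list")
    if not all(isinstance(code, str) for code in languages):
        raise TypeError("target_languages must contain only string language codes")

    normalized = {"target_languages": languages}
    # Both optional flags default to false according to the parameter table.
    for flag in ("formal", "match_original_utterance"):
        value = config.get(flag, False)
        if not isinstance(value, bool):
            raise TypeError(f"{flag} must be a boolean")
        normalized[flag] = value
    return normalized


translation = normalize_translation_config({"target_languages": ["es", "de"]})
request_fragment = {"speech_understanding": {"request": {"translation": translation}}}
```

Validating locally surfaces a malformed configuration before any network call is made; the API itself remains the authority on which language codes are accepted.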
The Translation API returns your original transcript response with an additional translated_texts key containing the translations. When match_original_utterance is enabled with speaker_labels, each utterance in the utterances array will also include its own translated_texts key.
{
  "id": "735d90b6-2e8b-4748-b75d-d02b78eb7811",
  "status": "completed",
  "text": "Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US...",
  "translated_texts": {
    "es": "El humo de cientos de incendios forestales en Canadá está provocando alertas de calidad del aire en todo Estados Unidos...",
    "de": "Rauch von Hunderten von Waldbränden in Kanada löst in den gesamten USA Luftqualitätswarnungen aus..."
  },
  "speech_understanding": {
    "request": {
      "translation": {
        "formal": true,
        "target_languages": ["es", "de"]
      }
    },
    "response": {
      "translation": {
        "status": "success"
      }
    }
  }
}
| Key | Type | Description |
| --- | --- | --- |
| translated_texts | object | An object containing the translated texts, where each key is a language code and each value is the full translated transcript text. |
| utterances[].translated_texts | object | (When match_original_utterance is true) An object containing the translations for this specific utterance, with language codes as keys. |
| speech_understanding | object | Container for speech understanding request and response information. |
| speech_understanding.request | object | The original translation request configuration that was submitted. |
| speech_understanding.request.translation | object | The translation parameters that were used. |
| speech_understanding.response | object | The response information from the translation process. |
| speech_understanding.response.translation | object | Status information about the translation. |
| speech_understanding.response.translation.status | string | The status of the translation. Will be "success" when translation completes successfully. |
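One way to read these fields defensively is to check the translation status before touching translated_texts. The sketch below uses a hypothetical helper, get_translations; only the "success" status value comes from the table above, so any other value is simply treated as a failure here.

```python
def get_translations(response):
    """Return the translated_texts mapping if the translation status is "success"."""
    status = (
        response.get("speech_understanding", {})
        .get("response", {})
        .get("translation", {})
        .get("status")
    )
    # Only "success" is documented; anything else is treated as a failure (assumption).
    if status != "success":
        raise RuntimeError(f"Translation did not succeed (status: {status!r})")
    return response.get("translated_texts", {})


# Hand-written sample shaped like the documented response (not real API output).
sample = {
    "translated_texts": {"es": "El humo...", "de": "Rauch..."},
    "speech_understanding": {
        "response": {"translation": {"status": "success"}}
    },
}

for language_code, text in get_translations(sample).items():
    print(f"{language_code.upper()}: {text}")
```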