Automatic language detection

Supported languages

Automatic language detection is supported for all languages.

Automatic language detection identifies the dominant language spoken in an audio file and uses it for the transcription. Enable it to detect any of the supported languages.

To reliably identify the dominant language, the file must contain at least 50 seconds of spoken audio.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(language_detection=True)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_code"])

Select model class based on detected language

By running automatic language detection on a small chunk of audio first, you can then choose between the Best and Nano models based on the detected language. To learn more, see Separating automatic language detection from transcription.
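
As a rough illustration of the selection step, the sketch below maps a detected language code to a model name. The function name `choose_speech_model`, the `ASSUMED_BEST_LANGUAGES` set, and the returned strings `"best"` and `"nano"` are assumptions made for this example, not part of the API. The language set in particular is a placeholder; replace it with the languages the Best model actually supports.

```python
# Hypothetical placeholder set: NOT the real list of languages covered by Best.
ASSUMED_BEST_LANGUAGES = frozenset({"en", "es", "fr", "de"})


def choose_speech_model(language_code, best_languages=ASSUMED_BEST_LANGUAGES):
    """Return "best" if the detected language is in best_languages, else "nano"."""
    # Normalize so that " EN " and "en" are treated the same way.
    normalized = language_code.strip().lower()
    return "best" if normalized in best_languages else "nano"
```

The detected code would come from a first, short transcription with language detection enabled, for example `transcript.json_response["language_code"]`; the returned name can then drive the configuration of the full transcription request.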

Confidence score

If language detection is enabled, the API returns a confidence score for the detected language. The score ranges from 0.0 (low confidence) to 1.0 (high confidence).

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(language_detection=True)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
print(transcript.json_response["language_confidence"])
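
The `language_code` and `language_confidence` fields of the response can also be read together. The helper below is a minimal sketch of that idea; `describe_detection` is a hypothetical name introduced for this example, and it accepts any plain dictionary shaped like `transcript.json_response`.

```python
def describe_detection(response):
    """Summarize the detected language and its confidence from a response dict."""
    code = response.get("language_code")
    confidence = response.get("language_confidence")
    # Either field may be absent, for example if detection was not enabled.
    if code is None or confidence is None:
        return "language detection result unavailable"
    # Format the 0.0 to 1.0 score as a whole-number percentage.
    return f"{code} ({confidence:.0%} confidence)"
```

With a completed transcript, this could be called as `print(describe_detection(transcript.json_response))`.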

Set a language confidence threshold

When language detection is enabled, you can set a minimum confidence threshold that the detected language must reach. If the language confidence is below this threshold, the API returns an error. Valid values range from 0 to 1, inclusive.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/wildfires.mp3"

config = aai.TranscriptionConfig(language_detection=True, language_confidence_threshold=0.8)

transcript = aai.Transcriber(config=config).transcribe(audio_file)

if transcript.status == "error":
    raise RuntimeError(f"Transcription failed: {transcript.error}")

Fallback to a default language

For a workflow that resubmits a transcription request using a default language if the threshold is not reached, see this cookbook.
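
The cookbook is not reproduced here. The following is only a minimal sketch of what such a resubmission workflow might look like, not the cookbook's implementation. The names `transcribe_with_fallback` and `submit` are hypothetical; `submit` is a caller-supplied function that sends one request and returns a dictionary with at least a `status` key. The sketch also assumes that any error on the first attempt should trigger the fallback; a real workflow would likely inspect the error message before resubmitting.

```python
def transcribe_with_fallback(audio_file, submit, default_language="en"):
    """Try language detection first; resubmit with default_language on error.

    submit(audio_file, language_code) is a hypothetical caller-supplied function.
    Passing language_code=None means "use automatic language detection with a
    confidence threshold"; passing a code means "transcribe in that language".
    """
    first_attempt = submit(audio_file, None)
    if first_attempt.get("status") != "error":
        return first_attempt
    # Assumption: the error is treated as "confidence threshold not reached".
    return submit(audio_file, default_language)
```

With the SDK calls shown earlier, `submit` could build `aai.TranscriptionConfig(language_detection=True, language_confidence_threshold=0.8)` when `language_code` is `None` and `aai.TranscriptionConfig(language_code=language_code)` otherwise, then return the transcript's status, error, and text in a dictionary. Injecting `submit` keeps the retry logic separate from the network call, so it can be exercised with a stub.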