Separating automatic language detection from transcription

In this guide, you’ll learn how to perform automatic language detection (ALD) separately from the transcription process. The file is then routed to either the Best or Nano model class for transcription, depending on whether the detected language is supported.

This workflow is designed to be cost-effective: it slices out the first 60 seconds of audio and runs it through Nano ALD, which detects 99 languages. The language detection step costs $0.002 per transcript (not including the cost of the full transcription).
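As a quick sanity check on that figure, the per-detection cost follows directly from the 60-second slice. The hourly rate below is an assumption for illustration; verify it against AssemblyAI's current pricing page.

```python
# Hypothetical cost estimate for the ALD step.
# The $0.12-per-audio-hour Nano rate is an assumption; check current pricing.
nano_rate_per_hour = 0.12
sliced_seconds = 60
cost = nano_rate_per_hour * sliced_seconds / 3600
print(f"${cost:.3f} per language-detection pass")  # $0.002
```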

Performing ALD with this workflow has a few benefits:

  • Cost-effective language detection
  • Ability to detect 99 languages
  • Ability to use Nano as a fallback if the language is not supported in Best
  • Ability to enable Audio Intelligence models if the language is supported
  • Ability to use LeMUR with LLM prompts in the detected language (e.g., Spanish prompts for Spanish audio)

Before you begin

To complete this tutorial, you need:

  • An AssemblyAI account and an API key.

The entire source code of this guide can be viewed here.

Step-by-step instructions

Install the Python SDK:

```shell
pip install assemblyai
```

Import the SDK and set your API key:

```python
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"
```

Create a set with all supported languages for Best. You can find them in our documentation here.

```python
supported_languages_for_best = {
    "en",
    "es",
    "fr",
    "de",
    "it",
    "pt",
    "nl",
    "hi",
    "ja",
    "zh",
    "fi",
    "ko",
    "pl",
    "ru",
    "tr",
    "uk",
    "vi",
}
```
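The routing rule the rest of this guide relies on can be sketched as a standalone check. `choose_model` is an illustrative helper, not part of the SDK, and the set is repeated here only so the snippet runs on its own:

```python
# The Best-supported set from the previous step, repeated so this snippet is standalone.
supported_languages_for_best = {
    "en", "es", "fr", "de", "it", "pt", "nl", "hi", "ja",
    "zh", "fi", "ko", "pl", "ru", "tr", "uk", "vi",
}

def choose_model(language_code):
    # Route to Best when the detected language is supported; otherwise fall back to Nano.
    return "best" if language_code in supported_languages_for_best else "nano"

print(choose_model("es"))  # best
print(choose_model("sl"))  # nano (Slovenian is not in the Best set)
```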

Define a Transcriber. Note that we don’t pass a global TranscriptionConfig here; instead, we apply a different one in each transcribe() call.

```python
transcriber = aai.Transcriber()
```

Define two helper functions:

  • detect_language() performs language detection on the first 60 seconds of the audio using Nano and returns the language code.
  • transcribe_file() performs the transcription using Best or Nano depending on the identified language.

```python
def detect_language(audio_url):
    config = aai.TranscriptionConfig(
        audio_end_at=60000,  # first 60 seconds (in milliseconds)
        language_detection=True,
        speech_model=aai.SpeechModel.nano,
    )
    transcript = transcriber.transcribe(audio_url, config=config)
    return transcript.json_response["language_code"]


def transcribe_file(audio_url, language_code):
    config = aai.TranscriptionConfig(
        language_code=language_code,
        speech_model=(
            aai.SpeechModel.best
            if language_code in supported_languages_for_best
            else aai.SpeechModel.nano
        ),
    )
    transcript = transcriber.transcribe(audio_url, config=config)
    return transcript
```

Test the code with different audio files. Apply both helper functions sequentially to each file to first identify the language and then transcribe the file.

```python
audio_urls = [
    "https://storage.googleapis.com/aai-web-samples/public_benchmarking_portugese.mp3",
    "https://storage.googleapis.com/aai-web-samples/public_benchmarking_spanish.mp3",
    "https://storage.googleapis.com/aai-web-samples/slovenian_luka_doncic_interview.mp3",
    "https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3",
]

for audio_url in audio_urls:
    language_code = detect_language(audio_url)
    print("Identified language:", language_code)

    transcript = transcribe_file(audio_url, language_code)
    print("Transcript:", transcript.text[:100], "...")
```

Output:

```
Identified language: pt
Transcript: e aí Olá pessoal, sejam bem-vindos a mais um vídeo e hoje eu vou ensinar-vos como fazer esta espada ...
Identified language: es
Transcript: Precisamente sobre este caso, el diario estadounidense New York Times reveló este sábado un conjunto ...
Identified language: sl
Transcript: Ni lepška, kaj videl tega otroka v mrekoj svojga okolja, da mu je uspil in to v takimi miri, da pač ...
Identified language: en
Transcript: Runner's knee runner's knee is a condition characterized by pain behind or around the kneecap. It is ...
```