Quickstart

First, install the assemblyai Python package.

$ pip install assemblyai

Set your AssemblyAI API key as an environment variable named ASSEMBLYAI_API_KEY. You can get a free AssemblyAI API key from the AssemblyAI dashboard.

# Mac/Linux:
export ASSEMBLYAI_API_KEY=<YOUR_API_KEY>

# Windows:
set ASSEMBLYAI_API_KEY=<YOUR_API_KEY>
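
If you want to fail early when the variable is missing, a small check like the following can run before you create the reader. This is a minimal sketch: require_api_key is a hypothetical helper name, not part of the assemblyai or llama_hub packages, and it only reads the ASSEMBLYAI_API_KEY variable described above.

```python
import os


def require_api_key(env_var: str = "ASSEMBLYAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of any AssemblyAI package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the reader."
        )
    return key
```

Calling require_api_key() at the top of a script surfaces a missing key immediately rather than partway through a transcription request.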

To load and transcribe audio data into documents, import the AssemblyAIAudioTranscriptReader and set its file_path argument to a URL or a local file path of an audio or video file.

from llama_hub.assemblyai import AssemblyAIAudioTranscriptReader

audio_file = "https://assembly.ai/nbc.mp3"
# or a local file path: audio_file = "./nbc.mp3"

reader = AssemblyAIAudioTranscriptReader(file_path=audio_file)

docs = reader.load_data()

reader.load_data() waits until the transcription is ready.
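
Because the call blocks until the transcript is ready, it can be worth validating the input before you start. The sketch below is an optional pre-check, assuming you only pass http(s) URLs or local paths; check_audio_source is a hypothetical helper name and is not something the reader requires or provides.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_audio_source(file_path: str) -> str:
    """Classify file_path as "url" or "local".

    Hypothetical pre-check for illustration. Raises FileNotFoundError
    when the value is not an http(s) URL and no such local file exists.
    """
    scheme = urlparse(file_path).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    if not Path(file_path).is_file():
        raise FileNotFoundError(f"Audio file not found: {file_path}")
    return "local"
```

Running this on the value you intend to pass as file_path catches a mistyped local path before any network request is made.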

The reader.load_data() method returns an array of documents, but by default, there’s only one document in the array with the full transcript. The transcribed text is available in the text attribute:

docs[0].text
# "Load time, a new president and new congressional makeup. Same old ..."

The metadata contains the full transcript object with more meta information:

docs[0].metadata
# {'language_code': <LanguageCode.en_us: 'en_us'>,
#  'audio_url': 'https://assembly.ai/nbc.mp3',
#  'punctuate': True,
#  'format_text': True,
#   ...
# }
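
As the output above shows, some metadata values are enum members (for example LanguageCode.en_us) rather than plain strings, which can get in the way when you log or store the metadata as JSON. The following is a minimal sketch of one way to handle that with the standard library; metadata_to_json is a hypothetical helper name, and falling back to str() for other unknown types is an assumption, not documented behavior of the reader.

```python
import json
from enum import Enum


def metadata_to_json(metadata: dict) -> str:
    """Serialize a metadata dict to JSON, converting enum members to their values.

    Hypothetical helper for illustration; unknown non-enum types fall back to str().
    """

    def _default(value):
        # json.dumps calls this only for values it cannot serialize itself.
        if isinstance(value, Enum):
            return value.value
        return str(value)

    return json.dumps(metadata, default=_default, indent=2, sort_keys=True)
```

You could then call metadata_to_json(docs[0].metadata) to get a string that is safe to write to a log or a file.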

Transcript formats

You can specify the transcript_format argument to load the transcript in different formats.

Depending on the format, load_data() returns either one or more documents. These are the different TranscriptFormat options:

TEXT: One document with the transcription text
SENTENCES: Multiple documents, one per sentence of the transcription
PARAGRAPHS: Multiple documents, one per paragraph of the transcription
SUBTITLES_SRT: One document with the transcript exported in SRT subtitles format
SUBTITLES_VTT: One document with the transcript exported in VTT subtitles format

from llama_hub.assemblyai import AssemblyAIAudioTranscriptReader, TranscriptFormat

reader = AssemblyAIAudioTranscriptReader(
    file_path="./your_file.mp3",
    transcript_format=TranscriptFormat.SENTENCES,
)

docs = reader.load_data()
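
With one document per sentence, simple per-sentence operations become straightforward. As an illustration, the sketch below filters documents by keyword. It assumes only that each document exposes a text attribute, as shown earlier; find_matching_texts is a hypothetical helper name.

```python
from typing import Iterable, List


def find_matching_texts(docs: Iterable, keyword: str) -> List[str]:
    """Return the text of every document that contains keyword (case-insensitive).

    Hypothetical helper for illustration; works with any objects that have
    a .text attribute.
    """
    needle = keyword.casefold()
    return [doc.text for doc in docs if needle in doc.text.casefold()]
```

For example, find_matching_texts(docs, "president") would return the sentences that mention that word.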

Transcription config

You can also specify the config argument to use different audio intelligence models.

import assemblyai as aai

config = aai.TranscriptionConfig(
    speaker_labels=True,
    auto_chapters=True,
    entity_detection=True,
)

reader = AssemblyAIAudioTranscriptReader(
    file_path="./your_file.mp3",
    config=config,
)

Pass the API key as an argument

You can also pass the AssemblyAI API key as an argument instead of an environment variable.

reader = AssemblyAIAudioTranscriptReader(
    file_path="./your_file.mp3",
    api_key="<YOUR_API_KEY>",
)

Additional resources

You can learn more about using LlamaIndex with AssemblyAI in these resources.