LlamaIndex TypeScript Integration with AssemblyAI

You can use the AssemblyAI readers from LlamaIndex.TS to transcribe audio files inside your LlamaIndex applications.

Looking for the Python integration? Check out the LlamaIndex Python integration.

Quickstart

Install LlamaIndex.TS by following their instructions.

To use the loaders, you need an AssemblyAI account and an API key, which you can get from the dashboard. Configure the API key as the ASSEMBLYAI_API_KEY environment variable or pass it as the apiKey option.

import {
  AudioTranscriptReader,
  AudioTranscriptParagraphsReader,
  AudioTranscriptSentencesReader,
} from "llamaindex";

// You can also use a local file path and the loader will upload it to AssemblyAI for you.
const audioUrl = "https://assembly.ai/espn.m4a";

const reader = new AudioTranscriptReader({
  apiKey: "<ASSEMBLYAI_API_KEY>", // or set the `ASSEMBLYAI_API_KEY` env variable
});

// Transcribe audio and store transcript in documents
const docs = await reader.loadData({
  audio: audioUrl,
  language_code: "en_us",
  // any other parameters as documented here: https://www.assemblyai.com/docs/api-reference/transcript#create-a-transcript
});

console.dir(docs, { depth: Infinity });

You can use the AudioTranscriptParagraphsReader or AudioTranscriptSentencesReader to split the transcript into paragraphs or sentences.

- The audio parameter can be a URL, a local file path, a file buffer, or a stream.
- The audio can also be a video file. See the list of supported file types in the FAQ doc.
- If you don’t pass in the apiKey option, the loader will use the ASSEMBLYAI_API_KEY environment variable.
- You can add more properties in addition to audio. Find the full list of request parameters in the AssemblyAI API docs.
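The API key fallback described above can be pictured with a small sketch. This is not the loader's actual implementation: resolveApiKey, ReaderOptions, and Env are hypothetical names introduced only for illustration, and the environment is passed in as a plain object so the snippet does not depend on any runtime.

```typescript
// Hypothetical helper for illustration only; not part of llamaindex or the AssemblyAI SDK.
type Env = Record<string, string | undefined>;

interface ReaderOptions {
  apiKey?: string;
}

// Returns the key from the `apiKey` option if present, otherwise from
// the ASSEMBLYAI_API_KEY entry of the given environment object.
function resolveApiKey(options: ReaderOptions, env: Env): string {
  const key = options.apiKey ?? env["ASSEMBLYAI_API_KEY"];
  if (!key) {
    throw new Error(
      "No AssemblyAI API key found: pass `apiKey` or set ASSEMBLYAI_API_KEY."
    );
  }
  return key;
}
```

In a Node.js application you would typically pass process.env as the second argument. Failing early with a clear message, as sketched here, tends to be easier to debug than a rejected API request later on.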

You can also use the AudioSubtitlesReader to get SRT or VTT subtitles as a document.

import { AudioSubtitlesReader } from "llamaindex";

// You can also use a local file path and the loader will upload it to AssemblyAI for you.
const audioUrl = "https://assembly.ai/espn.m4a";

const reader = new AudioSubtitlesReader({
  apiKey: "<ASSEMBLYAI_API_KEY>", // or set the `ASSEMBLYAI_API_KEY` env variable
});

// Transcribe audio and store the subtitles in documents
const docs = await reader.loadData(
  {
    audio: audioUrl,
    language_code: "en_us",
    // any other parameters as documented here: https://www.assemblyai.com/docs/api-reference/transcript#create-a-transcript
  },
  "srt" // srt or vtt
);

console.dir(docs, { depth: Infinity });
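
If you need individual cues rather than one block of subtitle text, you can split the SRT content yourself. The following is a minimal sketch, assuming the returned document's text holds the raw SRT string in the standard layout (index line, timing line, then one or more text lines, with cues separated by blank lines). parseSrt and SubtitleCue are hypothetical names and are not part of llamaindex or the AssemblyAI SDK.

```typescript
// Hypothetical helper for illustration only; not part of llamaindex or the AssemblyAI SDK.
interface SubtitleCue {
  index: number;
  start: string; // timestamp as written in the file, e.g. "00:00:00,240"
  end: string;
  text: string;
}

// Splits raw SRT text into cues. Blocks that do not match the expected
// layout are skipped instead of raising an error.
function parseSrt(srt: string): SubtitleCue[] {
  const cues: SubtitleCue[] = [];
  // Normalize Windows line endings, then split on one or more blank lines.
  const blocks = srt.replace(/\r\n/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    if (lines.length < 3) continue; // need index, timing, and text lines
    const timing = lines[1].match(/^(\S+)\s+-->\s+(\S+)/);
    if (!timing) continue;
    cues.push({
      index: Number(lines[0]),
      start: timing[1],
      end: timing[2],
      text: lines.slice(2).join("\n"),
    });
  }
  return cues;
}
```

Timestamps are kept as strings here to keep the sketch short. Convert them to milliseconds if you need to sort or align cues.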

Additional resources

You can learn more about using LlamaIndex.TS with AssemblyAI in these resources:

The AssemblyAI audio reader references