LlamaIndex TypeScript Integration with AssemblyAI
You can use the AssemblyAI readers from LlamaIndex.TS to transcribe audio files inside your LlamaIndex applications.
Looking for the Python integration? Check out the LlamaIndex Python integration.
Quickstart
Install LlamaIndex.TS by following their instructions.
To use the loaders, you need an AssemblyAI account and get your AssemblyAI API key from the dashboard.
Configure the API key as the ASSEMBLYAI_API_KEY
environment variable or the apiKey
options parameter.
import {
AudioTranscriptReader,
AudioTranscriptParagraphsReader,
AudioTranscriptSentencesReader,
} from "llamaindex";
// You can also use a local file path and the loader will upload it to AssemblyAI for you.
const audioUrl = "https://assembly.ai/espn.m4a";
const reader = new AudioTranscriptReader({
apiKey: "<ASSEMBLYAI_API_KEY>", // or set the `ASSEMBLYAI_API_KEY` env variable
});
// Transcribe audio and store transcript in documents
const docs = await reader.loadData({
audio: audioUrl,
language_code: "en_us",
// any other parameters as documented here: https://www.assemblyai.com/docs/api-reference/transcript#create-a-transcript
});
console.dir(docs, { depth: Infinity });
info
- You can use the
AudioTranscriptParagraphsReader
orAudioTranscriptSentencesReader
to split the transcript into paragraphs or sentences. - The
audio
parameter can be a URL, a local file path, a file buffer, or a stream. - The
audio
can also be a video file. See the list of supported file types in the FAQ doc. - If you don't pass in the
apiKey
option, the loader will use theASSEMBLYAI_API_KEY
environment variable. - You can add more properties in addition to
audio
. Find the full list of request parameters in the AssemblyAI API docs.
You can also use the AudioSubtitlesReader
to get srt
or vtt
subtitles as a document.
import { AudioSubtitlesReader } from "llamaindex";
// You can also use a local file path and the loader will upload it to AssemblyAI for you.
const audioUrl = "https://assembly.ai/espn.m4a";
const reader = new AudioSubtitlesReader({
apiKey: "<ASSEMBLYAI_API_KEY>", // or set the `ASSEMBLYAI_API_KEY` env variable
});
// Transcribe audio and store transcript in documents
const docs = await reader.loadData(
{
audio: audioUrl,
language_code: "en_us",
// any other parameters as documented here: https://www.assemblyai.com/docs/api-reference/transcript#create-a-transcript
},
"srt" // srt or vtt
);
console.dir(docs, { depth: Infinity });
Additional resources
You can learn more about using LlamaIndex.TS with AssemblyAI in these resources:
The AssemblyAI audio reader references