
🦜️🔗 LangChain Python Integration with AssemblyAI

To apply LLMs to speech, you first need to transcribe the audio to text, which is what the AssemblyAI integration for LangChain helps you with.

Looking for the LangChain JavaScript integration?
Go to the LangChain.JS integration.

Quickstart

Install the LangChain package and the AssemblyAI Python SDK:

pip install langchain
pip install assemblyai

Set your AssemblyAI API key as an environment variable named ASSEMBLYAI_API_KEY. You can get a free AssemblyAI API key from the AssemblyAI dashboard.

# Mac/Linux:
export ASSEMBLYAI_API_KEY=YOUR_API_KEY

# Windows:
set ASSEMBLYAI_API_KEY=YOUR_API_KEY
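
Before constructing the loader, it can help to confirm that the key is actually visible to the Python process. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper name and is not part of LangChain or the AssemblyAI SDK.

```python
import os


def require_api_key(env=os.environ):
    """Return the AssemblyAI API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of LangChain or the
    AssemblyAI SDK.
    """
    # Treat a missing, empty, or whitespace-only value as "not set".
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ASSEMBLYAI_API_KEY is not set; export it or pass api_key to the loader."
        )
    return key
```

Calling `require_api_key()` at startup surfaces a missing key immediately, rather than at the first transcription request.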

Import the AssemblyAIAudioTranscriptLoader from langchain.document_loaders.

from langchain.document_loaders import AssemblyAIAudioTranscriptLoader

  1. Pass the local file path or URL as the file_path argument of the AssemblyAIAudioTranscriptLoader.
  2. Call the load method to get the transcript as LangChain documents.

audio_file = "https://assembly.ai/sports_injuries.mp3"
# or a local file path: audio_file = "./sports_injuries.mp3"

loader = AssemblyAIAudioTranscriptLoader(file_path=audio_file)

docs = loader.load()
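
Because file_path accepts either a URL or a local path, a small pre-flight check can catch a mistyped local path before any upload is attempted. This is only a sketch: `check_audio_source` is a hypothetical helper, and the loader handles both forms on its own without it.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_audio_source(file_path):
    """Classify file_path as "url" or "local", failing early if a local file is missing.

    Hypothetical pre-flight helper; the loader accepts both forms by itself.
    """
    # Anything with an http(s) scheme is treated as a remote URL.
    if urlparse(file_path).scheme in ("http", "https"):
        return "url"
    # Everything else is assumed to be a path on the local filesystem.
    if not Path(file_path).is_file():
        raise FileNotFoundError(f"Audio file not found: {file_path}")
    return "local"
```

For example, `check_audio_source(audio_file)` could be called just before creating the loader.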

The load method returns a list of documents. By default, the list contains a single document with the full transcript.

The transcribed text is available in the page_content attribute:

docs[0].page_content
# Load time, a new president and new congressional makeup. Same old ...

The metadata attribute contains the full JSON response with additional information about the transcript:

{
    'language_code': <LanguageCode.en_us: 'en_us'>,
    'audio_url': 'https://assembly.ai/nbc.mp3',
    'punctuate': True,
    'format_text': True,
    ...
}
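
To inspect a loaded document without printing the whole transcript, the two attributes can be combined into a short summary. This is a sketch under assumptions: `summarize_document` is a hypothetical helper, and the default keys are simply the ones visible in the sample metadata above.

```python
def summarize_document(doc, preview_chars=80, keys=("language_code", "audio_url")):
    """Return a short text preview plus a few selected metadata fields.

    Works with any object exposing page_content and metadata, as the
    documents returned by load() do. Keys missing from the metadata come
    back as None instead of raising KeyError.
    """
    text = doc.page_content
    # Truncate long transcripts and mark the cut with an ellipsis.
    preview = text if len(text) <= preview_chars else text[:preview_chars] + "..."
    selected = {key: doc.metadata.get(key) for key in keys}
    return preview, selected
```

A call such as `preview, selected = summarize_document(docs[0])` then gives a compact view of the transcript and its source.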

Transcript formats

You can specify the transcript_format argument to load the transcript in different formats.

Depending on the format, load() returns one or more documents. These are the different TranscriptFormat options:

  • TEXT: One document with the transcription text
  • SENTENCES: Multiple documents, one per sentence of the transcription
  • PARAGRAPHS: Multiple documents, one per paragraph of the transcription
  • SUBTITLES_SRT: One document with the transcript exported in SRT subtitles format
  • SUBTITLES_VTT: One document with the transcript exported in VTT subtitles format

import assemblyai as aai
from langchain.document_loaders import AssemblyAIAudioTranscriptLoader
from langchain.document_loaders.assemblyai import TranscriptFormat

loader = AssemblyAIAudioTranscriptLoader(
    file_path="./your_file.mp3",
    transcript_format=TranscriptFormat.SENTENCES,
)

docs = loader.load()
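
The subtitle formats produce a single document whose page_content holds the subtitle text, so saving it to disk is a natural next step. A minimal sketch, assuming the loader was created with TranscriptFormat.SUBTITLES_SRT or TranscriptFormat.SUBTITLES_VTT; `save_subtitles` is a hypothetical helper and not part of either library.

```python
from pathlib import Path


def save_subtitles(docs, path):
    """Write the single subtitle document to path and return the character count.

    Intended for the SUBTITLES_SRT and SUBTITLES_VTT formats, which yield
    exactly one document. Hypothetical helper for illustration.
    """
    if len(docs) != 1:
        raise ValueError(f"Expected exactly one document, got {len(docs)}")
    # Path.write_text returns the number of characters written.
    return Path(path).write_text(docs[0].page_content, encoding="utf-8")
```

For example, `save_subtitles(docs, "your_file.srt")` would write the SRT export next to the script. The length check guards against accidentally passing the output of the SENTENCES or PARAGRAPHS formats.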

Transcription config

You can also specify the config argument to use different transcript features and speech understanding models. Here's an example of using the config argument to enable speaker labels, auto chapters, and entity detection:

import assemblyai as aai
from langchain.document_loaders import AssemblyAIAudioTranscriptLoader

config = aai.TranscriptionConfig(
    speaker_labels=True, auto_chapters=True, entity_detection=True
)

loader = AssemblyAIAudioTranscriptLoader(file_path="./your_file.mp3", config=config)
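
With the default transcript format, the results of these features should be readable from the document metadata, since it carries the full JSON response. The sketch below rests on an assumption: that the response stores speaker-labelled utterances, chapters, and entities under the keys "utterances", "chapters", and "entities". `summarize_features` is a hypothetical helper; check the keys against the Transcript API reference before relying on them.

```python
def summarize_features(metadata):
    """Count the results for speaker labels, auto chapters, and entity detection.

    Assumes metadata mirrors the transcript JSON response and that results
    live under the keys "utterances", "chapters", and "entities" (an
    assumption, not confirmed by this page). A key that is absent or null
    counts as zero.
    """
    return {
        key: len(metadata.get(key) or [])
        for key in ("utterances", "chapters", "entities")
    }
```

After `docs = loader.load()`, calling `summarize_features(docs[0].metadata)` gives a quick check that each enabled feature returned something.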

For the full list of options, see the Transcript API reference.

Pass the AssemblyAI API key as an argument

Instead of configuring the AssemblyAI API key as the ASSEMBLYAI_API_KEY environment variable, you can also pass it as the api_key argument.

loader = AssemblyAIAudioTranscriptLoader(
    file_path="./your_file.mp3", api_key="<YOUR_API_KEY>"
)

Additional resources

You can learn more about using LangChain with AssemblyAI in these resources.

  • LangChain docs for the AssemblyAI document loader
  • How to use audio data in LangChain with Python
  • Retrieval Augmented Generation on audio data with LangChain and Chroma
  • Build LangChain Audio Apps with Python in 5 Minutes
  • How to use LangChain for RAG over audio files
  • AssemblyAI Python SDK