Transcribe a pre-recorded audio file
Overview
This guide walks you through transcribing your first audio file with AssemblyAI. You will learn how to submit an audio file for transcription and retrieve the results using the AssemblyAI API.
When transcribing an audio file, there are three main things you will want to specify:
- The speech models you would like to use (required).
- The region you would like to use (optional).
- Other models you would like to use like Speaker Diarization or PII Redaction (optional).
Prerequisites
Before you begin, make sure you have:
- An AssemblyAI API key (get one by signing up at assemblyai.com)
- Python 3.8 or later installed
- The assemblyai package (pip install assemblyai)
Step 1: Set up your API credentials
First, configure your API endpoint and authentication:
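Here is a minimal sketch using the Python SDK (it assumes the aai.settings interface from the assemblyai package; check the SDK documentation if your installed version differs):

```python
import assemblyai as aai

# Authenticate the SDK; all subsequent requests use this key.
aai.settings.api_key = "YOUR_API_KEY"
```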
Replace YOUR_API_KEY with your actual AssemblyAI API key.
Need EU data residency?
Use our EU endpoint by changing base_url to "https://api.eu.assemblyai.com".
Step 2: Specify your audio source
You can transcribe audio files in two ways:
Option A: Use a publicly accessible URL
Option B: Use a local file
The SDK handles local file uploads automatically.
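For example (the URL and file path below are placeholders, not real media):

```python
# Option A: a publicly accessible URL.
audio_url = "https://example.com/path/to/audio.mp3"

# Option B: a local file path; the SDK uploads the file for you.
# audio_url = "./my-audio.mp3"
```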
Step 3: Submit the transcription request
Create a request with your audio URL and desired configuration options:
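The sketch below builds this request with the Python SDK. The speech_model value mirrors the model names described below and is an assumption; consult the API reference for the exact identifiers and for how to specify a secondary model such as universal-2:

```python
# Build the transcription configuration. The model identifier below is an
# assumption; see the API reference for the current model names and for how
# to request a secondary model such as universal-2.
config = aai.TranscriptionConfig(
    speech_model="universal-3-pro",
    language_detection=True,   # Automatic Language Detection
    speaker_labels=True,       # Speaker Diarization
)

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(audio_url, config)
```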
This configuration:
- Uses both the universal-3-pro and universal-2 models for broad language coverage. Learn more about our different speech recognition models here.
- Uses our Automatic Language Detection model to detect the dominant language in the spoken audio.
- Uses our Speaker Diarization model to create turn-by-turn utterances.
Model Pricing
Pricing can vary based on the speech model used in the request.
If you already have an account with us, you can find your specific pricing on the Billing page of your dashboard. If you are a new customer, you can find general pricing information here.
Step 4: Poll for the transcription result
Transcription happens asynchronously. Poll the API until the transcription is complete:
The SDK handles polling automatically, so once the call returns you can check the result.
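A minimal check, assuming the SDK's transcribe() call blocks until the transcript reaches a final status:

```python
if transcript.status == aai.TranscriptStatus.error:
    print(f"Transcription failed: {transcript.error}")
else:
    print(transcript.text)
```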
Step 5: Access speaker diarization (optional)
If you enabled speaker labels, you can access the speaker-separated utterances:
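For example, each utterance carries a speaker label and its text:

```python
# transcript.utterances is only populated when speaker_labels=True was set.
for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
```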
Complete example
Here is the complete example, combining the steps above:
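The sketch below assembles the steps above. The audio URL is a placeholder and the speech_model identifier is an assumption; adjust both to match your audio and the model names in the API reference:

```python
import assemblyai as aai

# Step 1: credentials (and, optionally, the EU endpoint).
aai.settings.api_key = "YOUR_API_KEY"
# aai.settings.base_url = "https://api.eu.assemblyai.com"

# Step 2: audio source (public URL or local file path).
audio_url = "https://example.com/path/to/audio.mp3"

# Step 3: configuration and request. The model identifier is an assumption;
# see the API reference for exact model names and secondary-model options.
config = aai.TranscriptionConfig(
    speech_model="universal-3-pro",
    language_detection=True,
    speaker_labels=True,
)
transcript = aai.Transcriber().transcribe(audio_url, config)

# Step 4: check the result.
if transcript.status == aai.TranscriptStatus.error:
    print(f"Transcription failed: {transcript.error}")
else:
    print(transcript.text)

    # Step 5: speaker-separated utterances.
    for utterance in transcript.utterances or []:
        print(f"Speaker {utterance.speaker}: {utterance.text}")
```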
Next steps
Now that you have transcribed your first audio file:
- Learn how you can do even more with Universal-3-Pro using prompting
- Explore our Speech Understanding features for more ways to analyze your audio data
- Learn more about searching, summarizing, or asking questions about your transcript with our LLM Gateway feature
- Find out how to use webhooks to get notified when your transcripts are ready
For more information, check out the full API reference documentation.