Transcribe a pre-recorded audio file

Learn how to transcribe and analyze an audio file.

Overview

This guide walks you through transcribing your first audio file with AssemblyAI. You will learn how to submit an audio file for transcription and retrieve the results using the AssemblyAI API.

When transcribing an audio file, there are three main things you will want to specify:

  1. The speech models you would like to use (required).
  2. The region you would like to use (optional).
  3. Other models you would like to use, such as Speaker Diarization or PII Redaction (optional).

Prerequisites

Before you begin, make sure you have:

  • An AssemblyAI API key (get one by signing up at assemblyai.com)
  • Python 3.8 or later installed
  • The assemblyai package (pip install assemblyai)

Step 1: Set up your API credentials

First, configure your API endpoint and authentication:

import assemblyai as aai

aai.settings.base_url = "https://api.assemblyai.com"
aai.settings.api_key = "YOUR_API_KEY"

Replace YOUR_API_KEY with your actual AssemblyAI API key.
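Rather than pasting the key into source code, you may prefer to read it from the environment. The sketch below is one way to do that with the standard library only. The helper name load_api_key and the variable name ASSEMBLYAI_API_KEY are assumptions made for this example, not something this guide requires.

```python
import os


def load_api_key(env=os.environ, name="ASSEMBLYAI_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing.

    Both the helper and the default variable name are illustrative.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage with the SDK (not executed here):
#   aai.settings.api_key = load_api_key()
```

Failing early with a clear message tends to be easier to debug than an authentication error returned later by the API.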

Need EU data residency?

Use our EU endpoint by changing base_url to "https://api.eu.assemblyai.com".
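If your code needs to switch between the two endpoints, a small lookup keeps the URLs in one place. This is only a sketch: the region keys "us" and "eu" and the helper base_url_for are hypothetical names chosen for the example; the two URLs are the ones given in this guide.

```python
# Region keys are hypothetical labels; the URLs come from this guide.
BASE_URLS = {
    "us": "https://api.assemblyai.com",
    "eu": "https://api.eu.assemblyai.com",
}


def base_url_for(region="us"):
    """Return the base URL for a region label, or raise on an unknown label."""
    try:
        return BASE_URLS[region.lower()]
    except KeyError:
        raise ValueError(f"Unknown region: {region!r}") from None


# Usage with the SDK (not executed here):
#   aai.settings.base_url = base_url_for("eu")
```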

Step 2: Specify your audio source

You can transcribe audio files in two ways:

Option A: Use a publicly accessible URL

audio_file = "https://assembly.ai/wildfires.mp3"

Option B: Use a local file

audio_file = "./example.mp3"

The SDK handles local file uploads automatically.
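Because the same variable can hold either a URL or a local path, it can help to validate it before submitting. The helper below is a minimal sketch using only the standard library; check_audio_source is a hypothetical name, and treating only http and https as URLs is an assumption of this example.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_audio_source(source):
    """Return "url" or "file"; raise if a local path does not exist."""
    # Assumption: anything without an http(s) scheme is a local path.
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    if not Path(source).is_file():
        raise FileNotFoundError(f"Audio file not found: {source}")
    return "file"
```

Calling check_audio_source(audio_file) before Step 3 catches a mistyped local path without making a network request.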

Step 3: Submit the transcription request

Create a request with your audio source (URL or local path) and the desired configuration options:

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
    speaker_labels=True,
)

transcript = aai.Transcriber().transcribe(audio_file, config=config)

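This configuration requests the universal-3-pro and universal-2 speech models, turns on automatic language detection, and turns on speaker labels.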

Model Pricing

Pricing can vary based on the speech model used in the request.

If you already have an account with us, you can find your specific pricing on the Billing page of your dashboard. If you are a new customer, you can find general pricing information on the AssemblyAI pricing page.

Step 4: Poll for the transcription result

Transcription happens asynchronously, but you do not need to write a polling loop: the SDK's transcribe() call polls the API and returns once the transcript has completed or failed. Check the result:

if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(f"Transcription failed: {transcript.error}")

print(f"\nFull Transcript:\n\n{transcript.text}")
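To make the idea of polling concrete, here is a generic loop of the kind a client might run. It is a sketch of the pattern, not the SDK's actual implementation: fetch_status is a hypothetical callable that returns the current status as a string, the status names are illustrative, and the interval and timeout defaults are arbitrary.

```python
import time


def poll_until_done(fetch_status, interval=3.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until it returns "completed" or "error".

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in ("completed", "error"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout} seconds")
        sleep(interval)


# Stand-in for a real status lookup: replays a fixed sequence of statuses.
fake_statuses = iter(["queued", "processing", "processing", "completed"])
final = poll_until_done(lambda: next(fake_statuses), sleep=lambda _: None)
print(final)
```

Passing sleep as a parameter keeps the loop testable without real delays; in normal use the default time.sleep applies.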

Step 5: Access speaker diarization (optional)

If you enabled speaker labels, you can access the speaker-separated utterances:

for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
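Once you have utterances, simple aggregation is often the next step. The sketch below counts words per speaker. To stay runnable without an API call, it uses a stand-in Utterance class that carries only the speaker and text attributes shown in this guide; the class, the helper words_per_speaker, and the sample sentences are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Utterance:
    """Stand-in with only the two attributes used in this guide."""
    speaker: str
    text: str


def words_per_speaker(utterances):
    """Return {speaker: word count}; an empty or missing list yields {}."""
    counts = defaultdict(int)
    for utterance in utterances or []:
        counts[utterance.speaker] += len(utterance.text.split())
    return dict(counts)


# Invented sample data, not real transcription output.
sample = [
    Utterance("A", "Smoke from the wildfires is spreading."),
    Utterance("B", "How bad is the air quality?"),
    Utterance("A", "It is unhealthy in several states."),
]
print(words_per_speaker(sample))
```

With a real transcript you would pass transcript.utterances instead of sample. The `or []` guard covers the case where no utterances are available, for example when speaker labels were not enabled.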

Complete example

Here is the full working code:

import assemblyai as aai

aai.settings.base_url = "https://api.assemblyai.com"
aai.settings.api_key = "YOUR_API_KEY"

# Use a publicly accessible URL
audio_file = "https://assembly.ai/wildfires.mp3"

# Or use a local file:
# audio_file = "./example.mp3"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
    speaker_labels=True,
)

transcript = aai.Transcriber().transcribe(audio_file, config=config)

if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(f"Transcription failed: {transcript.error}")

print(f"\nFull Transcript:\n\n{transcript.text}")

# Optionally print speaker diarization results
# for utterance in transcript.utterances:
#     print(f"Speaker {utterance.speaker}: {utterance.text}")

Next steps

Now that you have transcribed your first audio file, check out the full API reference documentation for more information.