Simple, transparent pricing.
Only pay for what you use.
Transcription accuracy: >90%
Available languages: 17+
Hours of training data: 1.1M
Includes
Speech recognition
Dual channel transcription
Speaker diarization
Export SRT or VTT caption files
Auto punctuation and casing
Filler word filtering
Auto language detection
Profanity filtering
Custom spelling
Word search
Custom vocabulary
Streaming Speech-to-Text
Transcribe audio/video files synchronously with high accuracy and low latency.
$0.47 / hour
Includes
Speech recognition with <600 ms of latency
Auto punctuation and casing
Custom vocabulary
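To make the per-hour streaming rate concrete, here is a minimal sketch of a cost estimate. It assumes usage is billed in proportion to audio duration and rounded to the nearest cent; this page does not state the billing granularity or rounding rule, so both are assumptions, and the function name `estimate_streaming_cost` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Listed streaming price in USD per hour of audio.
STREAMING_RATE_PER_HOUR = Decimal("0.47")


def estimate_streaming_cost(audio_seconds: int) -> Decimal:
    """Estimate the streaming cost in USD for a given audio duration.

    Assumption: cost scales linearly with duration and is rounded
    half-up to whole cents. The real billing rules may differ.
    """
    hours = Decimal(audio_seconds) / Decimal(3600)
    cost = hours * STREAMING_RATE_PER_HOUR
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Two hours of streamed audio.
    print(estimate_streaming_cost(2 * 3600))
```

Using `Decimal` rather than floats keeps the arithmetic exact for currency values.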
Audio Intelligence
Unlock the information in your audio.
Model
Price
$0.01 / hour
$0.02 / hour
$0.03 / hour
$0.05 / hour
$0.08 / hour
$0.08 / hour
$0.08 / hour
$0.15 / hour
$0.15 / hour
Includes
Understand vast audio and video content efficiently.
Ensure audience safety with proactive content moderation.
Extract context for smarter content strategies.
Prioritize user privacy with automatic data redaction.
Seamlessly categorize, highlight, and optimize audio/video content.
LeMUR
The easiest way to build LLM apps on voice data.
Model pricing (input / output)
Input: $0.015 / 1K tokens, Output: $0.043 / 1K tokens
Input: $0.015 / 1K tokens, Output: $0.043 / 1K tokens
Input: $0.002 / 1K tokens, Output: $0.005 / 1K tokens
Includes
Question & Answer
Action Items
Custom Summary
Custom Task
Benefits
Single API to connect voice data to an LLM in a few lines of code
Optimized for high-quality LLM responses
Pricing calculator
Enter your estimated input size, output size, and final model to get a price estimate.
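The estimate described above can be sketched in a few lines. This is an illustration, not the calculator's actual logic: it assumes cost is simply input tokens times the input rate plus output tokens times the output rate, using the per-1K-token rates listed in the LeMUR section. The tier labels (`tier_a`, `tier_b`, `tier_c`) and the function name are hypothetical placeholders, since model names are not given here.

```python
from decimal import Decimal

# (input rate, output rate) in USD per 1K tokens, taken from the LeMUR
# price list. Tier labels are hypothetical; model names are not listed.
LEMUR_RATES = {
    "tier_a": (Decimal("0.015"), Decimal("0.043")),
    "tier_b": (Decimal("0.015"), Decimal("0.043")),
    "tier_c": (Decimal("0.002"), Decimal("0.005")),
}


def estimate_lemur_cost(input_tokens: int, output_tokens: int, tier: str) -> Decimal:
    """Estimate the cost in USD of one request (assumed linear pricing)."""
    input_rate, output_rate = LEMUR_RATES[tier]
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / Decimal(1000)


if __name__ == "__main__":
    # 10K input tokens and 2K output tokens on the first listed tier.
    print(estimate_lemur_cost(10_000, 2_000, "tier_a"))
```

For example, 10,000 input tokens and 2,000 output tokens at the first listed rates works out to 10 × $0.015 + 2 × $0.043 = $0.236.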
Enterprise
For businesses with large volumes, additional support needs, and/or bespoke use cases.
Includes
Enable AI at scale with additional concurrency
Develop custom integrations with an AssemblyAI engineer
Build, troubleshoot, and plan with dedicated support
Custom pricing based on your use case and needs
Compliance with EU Data Residency standards
Frequently asked questions
Do you offer volume discounts?
How long does it take for audio and video files to process?
How does billing work?
How can I talk to someone?
What languages do you support?
What is a token?
Get started in seconds
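import assemblyai as aai
aai.settings.api_key = "YOUR_API_KEY"  # placeholder: replace with your own key
URL = "https://example.com/audio.mp3"  # placeholder: URL of the audio file to transcribe
config = aai.TranscriptionConfig()  # default settings; enable optional features here
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(URL, config)  # waits until the transcript is ready
print(transcript.text)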