Speech-to-Text for Developers

AssemblyAI is an extremely accurate, customizable, and easy-to-use Speech-to-Text API. Transcribe audio from phone calls, podcasts, and more.

import assemblyai

aai = assemblyai.Client(token='your-api-token')

# Optional: boost accuracy for keywords/phrases
phrases = ['cancel my account', 'Yiqin Dai', ...]
model = aai.train(phrases)

# Transcribe audio in any format
url = 'https://foo.com/bar.mp3'
transcript = aai.transcribe(audio_url=url, model=model)

Why choose AssemblyAI?

AssemblyAI uses advanced deep learning technology to generate extremely accurate transcriptions for your audio. See how we compare to services like Google and AWS Transcribe.

AssemblyAI

oh i'd say there's no such thing as eating too much but i just haven't massively fast metabolism and so i constantly eating so

Google (phone call model)

i see this message thing is eating too much but i just haven't massively fast metabolism and snow i constantly ding

Completely customizable to your application

Easily boost accuracy for keywords or phrases that are important, or add thousands of custom words to the vocabulary, to fine-tune the recognition for your specific needs.

Boost Accuracy for Keywords/Phrases

Add Custom Vocabulary

import assemblyai

aai = assemblyai.Client(token='your-api-token')

# Boost accuracy for an unlimited number of important
# keywords/phrases by creating a custom model.
# Models take around 6 minutes to train.
phrases = ['cancel my account', 'Dirk Gently', ...]
model = aai.train(phrases)

# Transcribe audio in any format
url = 'https://foo.com/bar.mp3'
transcript = aai.transcribe(audio_url=url, model=model)

text = transcript.text

{
  "transcript": {
    "id": 40,
    "status": "completed",
    "created": "2017-11-12T05:00:05.113353Z",
    "audio_src_url": "https://foo.com/bar.wav",
    "model_id": null,
    "text": "Welcome to AssemblyAI.",
    "confidence": 0.98,
    "segments": [...],
    "speaker_count": null
  }
}
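
As a rough illustration of working with a response shaped like the one above, the sketch below parses the JSON and returns the text only when the transcript is completed and reasonably confident. The field names come from the sample response; the `extract_text` helper and its `min_confidence` threshold are hypothetical and not part of the AssemblyAI client, and the elided `segments` list is stood in for by an empty list so the sample parses.

```python
import json

# Sample body modeled on the response above. The elided "segments" value
# is replaced with an empty list here purely so the JSON is parseable.
SAMPLE_RESPONSE = """
{
  "transcript": {
    "id": 40,
    "status": "completed",
    "created": "2017-11-12T05:00:05.113353Z",
    "audio_src_url": "https://foo.com/bar.wav",
    "model_id": null,
    "text": "Welcome to AssemblyAI.",
    "confidence": 0.98,
    "segments": [],
    "speaker_count": null
  }
}
"""


def extract_text(response_body, min_confidence=0.9):
    """Return the transcript text, or None if it is unfinished or low confidence.

    min_confidence is a hypothetical cutoff chosen for illustration.
    """
    transcript = json.loads(response_body)["transcript"]
    if transcript["status"] != "completed":
        return None
    if transcript["confidence"] < min_confidence:
        return None
    return transcript["text"]


print(extract_text(SAMPLE_RESPONSE))  # prints: Welcome to AssemblyAI.
```

Raising the threshold above the reported confidence, for example `extract_text(SAMPLE_RESPONSE, min_confidence=0.99)`, returns `None`, which can be a simple way to flag transcripts for manual review.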

More Reasons to Choose AssemblyAI

We're a team of researchers and engineers focused on building the best Speech-to-Text API for your company or product. We use the latest deep learning technologies so that your application always has the best Speech-to-Text available.

Continuously Improving

Every few weeks we ship accuracy improvements based on the most current deep learning research.

Custom Models

With custom models, you can recognize thousands of custom words and boost accuracy for important keywords/phrases.

Amazing Support

Talk to humans when you need help. Our team of developers is ready to answer your questions via Slack and email and to help however we can.

More Affordable

Pricing is a simple $0.0003 per second of audio sent to the API, billed monthly, without any weird rounding or minimums.
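
To make the per-second rate concrete, here is a minimal sketch of a cost estimate. It assumes only the $0.0003 per second figure stated above, with no rounding or minimums; the `estimate_cost` helper is hypothetical and monthly billing is not modeled.

```python
from decimal import Decimal

# Rate taken from the pricing statement above, in USD per second of audio.
PRICE_PER_SECOND = Decimal("0.0003")


def estimate_cost(audio_seconds):
    """Return the estimated cost in USD for the given seconds of audio."""
    # Convert via str so float inputs such as 12.5 do not carry binary noise.
    return PRICE_PER_SECOND * Decimal(str(audio_seconds))


# One hour of audio is 3600 seconds.
print(estimate_cost(3600))
```

At this rate, one hour of audio works out to $1.08.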

Supports All Audio Formats

The API accepts virtually any audio format, even the lossy, low-bitrate audio commonly found in phone calls. No need to worry about sample rates, bit rates, encodings, or other tricky signal processing details.

Secure and Private

We believe in privacy, and never store or copy the audio data you send to the API.

Ready to get started?