Speech Understanding
Gain maximum value from voice data with audio intelligence models, and leverage LLM capabilities with LeMUR to extract insights, generate summaries, and more.
The customer, John, called Acme Corporation's customer service department to report a malfunction with his widget. Sarah, the representative, attempted troubleshooting but concluded that the widget needed repair under warranty.
Customer Service, Product Support, Warranty, Troubleshooting, Repair
Acme Corporation, Sarah, John, Acme Widget, malfunction, serial number, batteries, troubleshooting, repair, warranty, prepaid shipping label
Extract valuable insights from voice data
Audio Intelligence
AI models to summarize speech, redact personal information, detect hateful content, identify spoken topics, and more.
LeMUR
With a single API call, summarize meetings, generate call insights, recap action items, and more on over 100 hours of audio data.
Audio Intelligence
Feature-rich AI models
Summarization
Leverage our AI-powered Summarization models to automatically summarize audio/video data in your products at scale. Customize the summary types to best fit your use case.
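As a rough sketch of what customizing the summary type could look like, the helper below builds a JSON request body for a transcription job with Summarization turned on. The field names `summarization`, `summary_model`, and `summary_type`, along with their allowed values, are recalled from AssemblyAI's public API reference and should be checked against the docs. The function name `build_summary_request` and the audio URL are hypothetical.

```python
import json

# Allowed values as recalled from the API reference; treat them as assumptions.
SUMMARY_MODELS = {"informative", "conversational", "catchy"}
SUMMARY_TYPES = {"bullets", "bullets_verbose", "gist", "headline", "paragraph"}


def build_summary_request(audio_url, model="informative", summary_type="bullets"):
    """Return a request body (dict) asking for a summary of the given audio."""
    if model not in SUMMARY_MODELS:
        raise ValueError(f"unknown summary model: {model!r}")
    if summary_type not in SUMMARY_TYPES:
        raise ValueError(f"unknown summary type: {summary_type!r}")
    return {
        "audio_url": audio_url,
        "summarization": True,
        "summary_model": model,
        "summary_type": summary_type,
    }


# Hypothetical URL, used only to show the shape of the resulting body.
body = build_summary_request("https://example.com/call.mp3", summary_type="paragraph")
print(json.dumps(body, indent=2))
```

Validating the options locally keeps a typo in the summary type from costing a round trip to the API.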
Content Moderation
Detect sensitive content in your audio and video files - such as hate speech, violence, sensitive social issues, alcohol, drugs, and more.
Sentiment Analysis
With Sentiment Analysis, AssemblyAI detects the sentiment of each sentence spoken in your audio files.
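Because sentiment is reported per sentence, a common next step is to aggregate it. The sketch below tallies labels from a list shaped like the `sentiment_analysis_results` array as I recall it from the API reference. The sample sentences, scores, and the helper name `sentiment_counts` are all made up for illustration.

```python
from collections import Counter

# Hypothetical response fragment: field names are an assumption, values are invented.
SAMPLE_RESULTS = [
    {"text": "My widget stopped working.", "sentiment": "NEGATIVE", "confidence": 0.91, "speaker": "A"},
    {"text": "I can help you with that.", "sentiment": "POSITIVE", "confidence": 0.84, "speaker": "B"},
    {"text": "What is the serial number?", "sentiment": "NEUTRAL", "confidence": 0.77, "speaker": "B"},
]


def sentiment_counts(results, min_confidence=0.0):
    """Count sentences per sentiment label, skipping low-confidence ones."""
    return Counter(
        r["sentiment"] for r in results if r["confidence"] >= min_confidence
    )


print(dict(sentiment_counts(SAMPLE_RESULTS, min_confidence=0.8)))
```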
Entity Detection
Identify a wide range of entities that are spoken in your audio files, such as person and company names, email addresses, dates, and locations.
PII Redaction
Identify and remove Personally Identifiable Information, such as phone numbers and social security numbers, from the transcription text before it is returned to you.
Topic Detection (IAB Classification)
Label the topics that are spoken in your audio and video files. The predicted topic labels follow the standardized IAB Taxonomy, which makes them suitable for contextual targeting.
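For contextual targeting you usually want only the most relevant labels. The sketch below assumes the response carries a mapping from topic label to relevance score (my recollection is that this lives under `iab_categories_result.summary`, but verify that in the docs). The labels, scores, and the helper name `top_topics` are hypothetical.

```python
# Hypothetical relevance scores keyed by topic label; labels and numbers are invented.
SAMPLE_SUMMARY = {
    "Business>CustomerService": 0.93,
    "Technology>ConsumerElectronics": 0.71,
    "Travel>AirTravel": 0.02,
}


def top_topics(summary, limit=3, min_relevance=0.5):
    """Return up to `limit` labels, most relevant first, above the threshold."""
    ranked = sorted(summary.items(), key=lambda item: item[1], reverse=True)
    return [label for label, relevance in ranked if relevance >= min_relevance][:limit]


print(top_topics(SAMPLE_SUMMARY))
```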
Auto Chapters
Automatically split audio and video files into chapters and generate a summary for each one, giving you a summary over time.
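One way to present a summary over time is as a timestamped outline. The sketch below assumes each chapter has `start` and `end` offsets in milliseconds plus a `headline`, which matches my recollection of the response format but is not confirmed on this page. The sample chapters and helper names are hypothetical.

```python
# Hypothetical chapters; field names are an assumption, values are invented.
SAMPLE_CHAPTERS = [
    {"start": 0, "end": 65500, "headline": "Customer reports a malfunction"},
    {"start": 65500, "end": 190000, "headline": "Troubleshooting and warranty repair"},
]


def format_timestamp(ms):
    """Convert a millisecond offset to MM:SS (minutes may exceed 59)."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def chapter_outline(chapters):
    """Return one 'start-end headline' line per chapter."""
    return [
        f"{format_timestamp(c['start'])}-{format_timestamp(c['end'])} {c['headline']}"
        for c in chapters
    ]


for line in chapter_outline(SAMPLE_CHAPTERS):
    print(line)
```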
Key Phrases
Accurately identify significant words and phrases, enabling you to extract the most pertinent concepts or highlights from your audio/video file.
See everything in docs
LeMUR
Leverage LLM capabilities and take action on your audio data
Ask questions
Get instant answers to questions about your audio.
Create summaries
Summarize your audio data with key takeaways.
Extract data
Extract data such as topic tags from your audio to categorize and organize your audio data.
Generate content
Generate long-form or short-form written content using your audio data.
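The four use cases above can all be framed as a prompt sent alongside one or more transcripts. The sketch below only builds such a request body and does not call any API. The `transcript_ids` and `prompt` field names are my recollection of the LeMUR task request and should be treated as assumptions. The prompt templates, the helper name `build_lemur_task`, and the transcript ID are hypothetical.

```python
# Hypothetical prompt templates, one per use case listed above.
PROMPTS = {
    "question": "Answer this question about the audio: {detail}",
    "summary": "Summarize the audio as a short list of key takeaways.",
    "extract": "List topic tags for the audio as a single comma-separated line.",
    "content": "Write a {detail} based on the audio.",
}


def build_lemur_task(transcript_ids, kind, detail=""):
    """Return a request body (dict) for one of the four use cases."""
    if not transcript_ids:
        raise ValueError("at least one transcript ID is required")
    if kind not in PROMPTS:
        raise ValueError(f"unknown task kind: {kind!r}")
    return {
        "transcript_ids": list(transcript_ids),
        "prompt": PROMPTS[kind].format(detail=detail),
    }


# "example-transcript-id" is a placeholder, not a real transcript.
print(build_lemur_task(["example-transcript-id"], "question", "What was wrong with the widget?"))
```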
Rapidly ship high-quality Generative AI features with voice data
Unifies your AI stack for audio
Powered by AssemblyAI’s speech recognition models
Continuously updated with the latest in research
Launch quickly & effortlessly scale
Explore more
Speech-to-Text
Build on top of the most accurate Speech-to-Text model on the market with >93% accuracy.
Streaming Speech-to-Text
Transcribe audio streams synchronously with high accuracy and low latency.
Get started in seconds
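import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder; use your own key
URL = "https://example.com/audio.mp3"  # placeholder; any publicly reachable audio file
config = aai.TranscriptionConfig(speaker_labels=True)  # request per-word speaker labels
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(URL, config)  # blocks until the job finishes
print(transcript.text)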
{
  "id": "6rlr37h8f4-e310-4e23-bbf3-ea5f347dc684",
  "language_code": "en_us",
  "status": "completed",
  "text": "Runner's knee is a condition characterized by pain behind or around the kneecap...",
  "confidence": 0.98122,
  "audio_duration": 3200,
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
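The `words` array in a response like the one above can be folded into speaker turns with a few lines of Python. In this minimal sketch the first two words are copied from the sample response, while the third word (speaker "B") is invented so the grouping has something to split on. The helper name `speaker_turns` is hypothetical.

```python
# The first two entries come from the sample response; the third is made up.
WORDS = [
    {"text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113},
    {"text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417},
    {"text": "Right.", "start": 1200, "end": 1500, "speaker": "B", "confidence": 0.9},
]


def speaker_turns(words):
    """Merge consecutive words by the same speaker into a single turn."""
    turns = []
    for word in words:
        if turns and turns[-1]["speaker"] == word["speaker"]:
            # Same speaker is still talking: extend the current turn.
            turns[-1]["text"] += " " + word["text"]
            turns[-1]["end"] = word["end"]
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append({
                "speaker": word["speaker"],
                "text": word["text"],
                "start": word["start"],
                "end": word["end"],
            })
    return turns


for turn in speaker_turns(WORDS):
    print(f"{turn['speaker']} [{turn['start']}-{turn['end']}]: {turn['text']}")
```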