Introducing Universal-2

Next-gen Speech AI for next-level product experiences

Our most advanced speech-to-text model captures the complexity of human speech for impeccable audio data that powers sharper insights, faster workflows, and best-in-class product experiences.

Leading the industry—again and again

Universal-2 builds on the strengths of Universal-1 with even greater accuracy and precision for audio data that doesn’t need double checking.

[Chart: Word Accuracy Rate by model, axis from 80% to 95%. Models shown: Universal-2, Universal-1, OpenAI, Microsoft, Deepgram, Amazon, Google.]

| Model | Word Accuracy Rate |
|---|---|
| AssemblyAI Universal-2 | 93.3% |
| AssemblyAI Universal-1 | 93.1% |
| OpenAI Whisper Large-v3 | 91.7% |
| Microsoft Azure Batch v3.1 | 91.2% |
| Deepgram Nova 2 | 90.8% |
| Amazon Transcribe | 89.7% |
| Google Latest-long | 85.2% |
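
This page does not define how Word Accuracy Rate is computed. A common convention, assumed here, is that it equals 1 minus the word error rate (WER), where WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that convention only. The function names and the lowercase, whitespace-split normalization are illustrative choices, not AssemblyAI's evaluation pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy_rate(reference: str, hypothesis: str) -> float:
    """Assumed convention: accuracy is the complement of WER."""
    return 1.0 - word_error_rate(reference, hypothesis)


# One substituted word out of five reference words.
reference = "the quick brown fox jumps"
hypothesis = "the quick brown box jumps"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
print(f"Word accuracy: {word_accuracy_rate(reference, hypothesis):.1%}")
```

Under this convention, a Word Accuracy Rate of 93.3% would correspond to a WER of 6.7%.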

Built on top of the best—then made even better

Accuracy is more than just the right words—it’s trust in your data. Universal-2 lets users spend less time filling in the gaps and more time putting insight into action.

WHAT’S IMPROVED

Proper nouns

A 24% improvement in recognizing rare words such as names, brands, and locations. This supports more personalized customer-facing communications, more intuitive automated systems, and cleaner integration processes.

[Illustration: Universal-2's improvements in handling proper nouns compared to Universal-1]

Text formatting

A 15% improvement in transcript structure, with proper punctuation and casing for items such as emails, dates, and dollar amounts. This makes information faster to navigate and transcripts read more naturally in customer products.

[Illustration: Universal-2's text formatting improvements compared to Universal-1]

Alphanumerics

A 21% increase in accuracy across critical data like phone numbers, zip codes, and other numerical identifiers for smoother customer experiences, better critical data management, and clearer escalation and reporting.

[Illustration: Universal-2's alphanumeric improvements compared to Universal-1]
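
The page does not say how the 24%, 15%, and 21% figures are derived. One reading that matches the error rates in the benchmark table on this page is that each is the relative reduction in error rate from Universal-1 to Universal-2. The sketch below checks that reading. The helper name `relative_reduction` is illustrative, and the interpretation itself is an assumption rather than a stated methodology.

```python
def relative_reduction(old_error: float, new_error: float) -> float:
    """Relative error reduction, in percent, going from old_error to new_error."""
    return (old_error - new_error) / old_error * 100


# (Universal-1 error %, Universal-2 error %) as reported in the page's benchmark table.
error_rates = {
    "Proper nouns": (18.17, 13.87),
    "Text formatting": (11.77, 10.06),
    "Alphanumerics": (5.06, 4.00),
}

for category, (universal_1, universal_2) in error_rates.items():
    reduction = relative_reduction(universal_1, universal_2)
    print(f"{category}: {reduction:.1f}% relative error reduction")
```

Rounded to whole numbers, the three results are 24, 15, and 21, which is consistent with the headline figures.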

Universal-2 captures real-world complexity

With lower error rates than Universal-1 in three key areas.

[Chart: error rates by model for proper nouns, text formatting, and alphanumerics, axis from 0% to 25%. Models shown: Universal-2, Universal-1, OpenAI, Microsoft, Deepgram, Amazon, Google. Bars are truncated at 25% for visualization.]

| Model | Proper nouns (Jaro-Winkler Error Rate) | Text formatting (Word Error Rate) | Alphanumerics (Word Error Rate) |
|---|---|---|---|
| AssemblyAI Universal-2 | 13.87% | 10.06% | 4.00% |
| AssemblyAI Universal-1 | 18.17% | 11.77% | 5.06% |
| OpenAI Whisper Large-v3 | 15.41% | 12.01% | 3.84% |
| Microsoft Azure Batch v3.1 | 26.84% | 12.14% | 5.19% |
| Deepgram Nova 2 | 21.14% | 12.39% | 4.97% |
| Amazon Transcribe | 37.57% | 14.47% | 6.24% |
| Google Latest-long | 47.64% | 25.45% | 8.43% |
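
The proper-noun column uses a Jaro-Winkler-based error rate, which this page does not define. The sketch below implements the standard Jaro-Winkler string similarity and, as an assumption, treats the error for one proper noun as 1 minus that similarity. How the benchmark aligns and aggregates proper nouns across a dataset is not stated here, so `jaro_winkler_error` is a hypothetical helper for illustration only.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match if equal and no farther apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def jaro_winkler_error(reference: str, hypothesis: str) -> float:
    """Assumed definition: error = 1 - Jaro-Winkler similarity."""
    return 1.0 - jaro_winkler(reference, hypothesis)


print(f"{jaro_winkler('MARTHA', 'MARHTA'):.4f}")
print(f"{jaro_winkler_error('MARTHA', 'MARHTA'):.4f}")
```

Unlike a word error rate, which counts a misspelled name as fully wrong, a similarity-based score gives partial credit to near misses, which may be why it is used for proper nouns.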

It’s more than accurate—it’s the industry preference

Universal-2 is the most preferred model to date. Before that? Universal-1 took the cake. We’ve made a habit out of making models people love.

Universal-2: 72.9%
Universal-1: 25.9%
Neutral: 1.2%

*Qualitative benchmarks from an unbiased, third-party evaluation.

The cleanest outputs in the industry

Universal-2 is closing the gap between transcription and true understanding, with best-in-breed audio data you can reliably stand on—and behind.

More on Universal-2

Research

Universal-2 is the latest milestone in AssemblyAI's mission to push the boundaries of Speech AI technology and unlock the full potential of voice data for all.

Explore the research

Playground

Access our production-ready Speech AI models for speech recognition, speaker detection, audio summarization, and more—all in our no-code playground.

Try our Playground

Pricing

Universal-2 is available as an API for developers to build applications and services. We offer pricing that scales with tiered payment options and custom volume discounts.

Get our pricing

Universal-2 is all-in-one

Our comprehensive system lets you build on our developer-preferred API with leading Speech AI capabilities, built-in model updates, and technology that keeps you on the cutting edge.

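import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder: replace with your own key
audio_url = "https://example.com/audio.mp3"  # placeholder audio URL
config = aai.TranscriptionConfig(speaker_labels=True)  # adds the "speaker" field to each word

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(audio_url, config)

print(transcript.text)
# Abridged example of the full JSON response: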
{
  "id": "6rlr37h8f4-e310-4e23-bbf3-ea5f347dc684",
  "language_code": "en_us",
  "status": "completed",
  "text": "Runner's knee is a condition characterized by pain behind or around the kneecap...",
  "confidence": 0.98122,
  "audio_duration": 3200,
  "words": [
    { "text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113 },
    { "text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417 }
  ]
}
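
As a rough illustration of working with a response shaped like the one above, the sketch below groups words by speaker and flags low-confidence words. It uses only the fields shown in the sample. It assumes that `start` and `end` are in milliseconds, and the helper names and the 0.96 threshold are hypothetical choices for the example.

```python
import json
from collections import defaultdict

# Trimmed copy of the sample response shown above.
RESPONSE_JSON = """
{
  "status": "completed",
  "confidence": 0.98122,
  "words": [
    {"text": "Runner's", "start": 0, "end": 550, "speaker": "A", "confidence": 0.98113},
    {"text": "knee", "start": 580, "end": 1130, "speaker": "A", "confidence": 0.95417}
  ]
}
"""


def words_by_speaker(response: dict) -> dict:
    """Group (text, start_seconds, end_seconds) tuples by speaker label."""
    grouped = defaultdict(list)
    for word in response.get("words", []):
        # Assumption: timestamps are milliseconds, so divide by 1000 for seconds.
        grouped[word.get("speaker")].append(
            (word["text"], word["start"] / 1000, word["end"] / 1000)
        )
    return dict(grouped)


def low_confidence_words(response: dict, threshold: float) -> list:
    """Return the text of words whose confidence falls below the threshold."""
    return [
        word["text"]
        for word in response.get("words", [])
        if word["confidence"] < threshold
    ]


response = json.loads(RESPONSE_JSON)
print(words_by_speaker(response))
print(low_confidence_words(response, threshold=0.96))
```

A check like `low_confidence_words` is one way to route uncertain words to human review while leaving the rest of the transcript untouched.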