April 12, 2024

Newsletter #30: 🚀 Universal-1 Model Launch

This week, we’ve launched Universal-1, our most powerful and accurate Speech-to-Text model to date, trained on 12.5M hours of multilingual audio data

Smitha Kolan

Developer Educator

Hey 👋, this weekly update contains the latest info on our new product features, tutorials, and our community.

🚀 Universal-1 Model Launch

This week, we’ve launched Universal-1, our most powerful and accurate Speech-to-Text model to date, trained on 12.5M hours of multilingual audio data.

Key Highlights of Universal-1:

71% better speaker count estimation and 14% better word timestamp estimation compared to our prior models.
Up to 30% fewer hallucinations compared to Whisper Large-v3, for cleaner, more reliable transcriptions.
Over 22% more accurate than the speech-to-text APIs from Azure, AWS, and Google.
Support for code-switching: it can transcribe multiple languages within a single audio file.
It processes an hour of audio in just 38 seconds. ⚡️
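To put the speed figure in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the "one hour in 38 seconds" number quoted above; the function names are hypothetical, and the estimate assumes processing time scales linearly with audio length, which is our simplification rather than a documented guarantee.

```python
# Back-of-the-envelope arithmetic for the "1 hour of audio in 38 seconds" figure.
# Assumption (not from the announcement): processing time scales linearly
# with audio duration. Function names are illustrative only.

REFERENCE_AUDIO_SECONDS = 3600      # one hour of audio
REFERENCE_PROCESSING_SECONDS = 38   # quoted processing time for that hour


def real_time_speedup(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many times faster than real time the audio was processed."""
    return audio_seconds / processing_seconds


def estimate_processing_seconds(audio_seconds: float) -> float:
    """Estimate processing time for a file, assuming linear scaling."""
    return audio_seconds * REFERENCE_PROCESSING_SECONDS / REFERENCE_AUDIO_SECONDS


if __name__ == "__main__":
    speedup = real_time_speedup(REFERENCE_AUDIO_SECONDS, REFERENCE_PROCESSING_SECONDS)
    print(f"Roughly {speedup:.1f}x faster than real time")  # Roughly 94.7x faster than real time
    print(f"A 30-minute file: about {estimate_processing_seconds(1800):.0f} seconds")
```

Actual turnaround for a given file will also depend on upload time, queueing, and which features are enabled, so treat these numbers as an illustration of the quoted figure and not as a benchmark.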

Check out our docs to start building with Universal-1.

Fresh From Our Blog

Transcribe an audio file with Universal-1 using Go: Learn how to transcribe an audio file in your Go applications with industry-leading accuracy using Universal-1. Read more>>

Automatically redact PII from audio and video with Python: Learn how to automatically redact Personally Identifiable Information (PII) from audio and video files in 5 minutes using Python and AssemblyAI. Read more>>

How to do Speech-To-Text with Go: Integrate speech recognition into your Go application in only a few lines of code. Read more>>

Our Trending YouTube Tutorials

This new model is transforming Speech AI: Accurate, Fast, Cost-Effective: AssemblyAI just launched Universal-1, our most capable and highly trained speech recognition model.

Coding an AI Voice Bot from Scratch: Real-Time Conversation with Python: Learn how to build a real-time AI voice assistant using Python that transcribes real-time speech, generates AI responses, and provides a human-like conversational experience.

How to Build a RAG Application for Multi-Speaker Audio Data: Learn how to build a RAG application in 10 minutes that can take multiple speakers into account when answering a question.
