Introducing Universal-3 Pro | AssemblyAI

Overview

Universal-3 Pro is our most powerful Voice AI model yet, designed to capture the “hard stuff” that traditional ASR models struggle with. The model out of the box outperforms all ASR models on the market on accuracy, especially as it pertains to entities and rare words. With prompting, you can get an entirely customized transcription output that rivals near-human-level transcription.

Universal-3 Pro is available for both pre-recorded (async) and streaming use cases. Configuration and settings differ between the two because streaming is optimized for real-time audio utterances typically under 10 seconds, with special efficiencies built into the model for low-latency turn detection and voice agent workflows.

Based on your use case, navigate to the appropriate guide below:

Universal-3 Pro Async

For pre-recorded audio files. Supports long-form audio, prompting, keyterms prompting, and full language detection.

Universal-3 Pro Streaming

For real-time audio streams. Optimized for low-latency turn detection, voice agents, and live transcription.