Changelog
Follow along to see weekly accuracy and product improvements.
Introducing Universal-Streaming
Universal-Streaming is our new speech-to-text (STT) model 🚀

What's Improved:
- Ultra-low latency with immutable transcripts - Universal-Streaming delivers ~300ms word emission with 41% faster median latency than Deepgram Nova-3, provides immutable final transcripts from the start to enable real-time agent processing, and offers latency-tunable features like the ability to toggle punctuation for maximum speed.
- Intelligent endpointing for smoother turn detection - Our end-of-turn model enhances speed and accuracy, supporting natural pauses without premature interruptions for smoother conversations.
- Accuracy on the tokens that matter - Universal-Streaming delivers substantial improvements in these challenging areas: 21% fewer alphanumeric errors on emails and codes, 28% improvement on consecutive numbers, and 5% better proper noun recognition. These improvements ensure fewer correction loops and silent transcription errors.
- Transparent pricing with unlimited concurrency - Pricing starts at $0.15/hr with volume discounts available for larger implementations. Scale confidently with unlimited concurrent streams with no hard caps or over-stream surcharges.
Learn more about Universal-Streaming in our blog and review our comprehensive Getting Started Guide for detailed implementation information.
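For rough budgeting, the $0.15/hr starting rate can be turned into a quick estimate. The sketch below assumes billing is linear in total streamed hours across all concurrent streams and ignores volume discounts. The function name and that billing assumption are illustrative and are not taken from the pricing documentation.

```python
from decimal import Decimal

# Starting list price quoted in this changelog entry (USD per streamed hour).
RATE_PER_HOUR = Decimal("0.15")


def estimate_streaming_cost(streams: int, hours_per_stream: float) -> Decimal:
    """Estimate cost, assuming linear billing on total streamed hours."""
    if streams < 0 or hours_per_stream < 0:
        raise ValueError("streams and hours_per_stream must be non-negative")
    # Go through str() so floats like 1.5 convert to an exact Decimal.
    total_hours = Decimal(streams) * Decimal(str(hours_per_stream))
    return (RATE_PER_HOUR * total_hours).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # 50 concurrent streams running 8 hours each = 400 streamed hours.
    print(estimate_streaming_cost(50, 8))
```

Because there are no over-stream surcharges, the estimate depends only on total hours, not on how many streams run at once.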
Slam-1 bugfix
We’ve fixed a bug on Slam-1 where users' keyterms_prompt value was occasionally appearing in the transcript text.
Error Message Improvement
Improved the error message returned when the region used to upload a file via the /upload endpoint does not match the region used to transcribe that URL.
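One way to avoid this mismatch is to derive both the upload URL and the transcription URL from a single region setting. A minimal sketch follows. The base URLs and the /v2/upload and /v2/transcript paths are assumptions to confirm against the API reference, the RegionalEndpoints helper is hypothetical, and no request is sent.

```python
from dataclasses import dataclass

# Assumed base URLs per region; confirm against the API reference before use.
BASE_URLS = {
    "us": "https://api.assemblyai.com",
    "eu": "https://api.eu.assemblyai.com",
}


@dataclass(frozen=True)
class RegionalEndpoints:
    """Hypothetical helper that pins upload and transcription to one region."""

    region: str

    def __post_init__(self) -> None:
        if self.region not in BASE_URLS:
            raise ValueError(f"unknown region: {self.region!r}")

    @property
    def upload_url(self) -> str:
        return f"{BASE_URLS[self.region]}/v2/upload"

    @property
    def transcript_url(self) -> str:
        return f"{BASE_URLS[self.region]}/v2/transcript"


if __name__ == "__main__":
    eu = RegionalEndpoints("eu")
    print(eu.upload_url)
    print(eu.transcript_url)
```

Building both URLs from the same object means a file uploaded in one region cannot accidentally be submitted for transcription in another.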
Enhanced Account Security
We've added Email Verification and Google OAuth:
- Google authentication users: If your account email is a Gmail address, you can simply click 'Continue with Google' for instant access, followed by account verification - no additional linking is needed.
- Email/password users: On your first login after this update, you'll receive a one-time link to reset your password. Click the link to set a new password and access your dashboard.
New LeMUR Models
We've expanded LeMUR capabilities with two powerful new models:
- Claude 3.7 Sonnet - The most intelligent model to date, featuring enhanced reasoning capabilities for complex audio analysis tasks.
- Claude 3.5 Haiku - The fastest model, optimized for quick responses while maintaining excellent reasoning abilities.
Whether you're analyzing customer calls, generating meeting summaries, or performing audio content analysis, these models deliver significant improvements.
You can begin using these new models right away with your existing LeMUR implementation. For detailed instructions on integration, model parameters, and code examples across all supported programming languages, check out our docs.
🚀 Slam-1 Public Beta 🚀
Slam-1, our new customizable Speech Language Model, is now available in public beta!
Slam-1 combines large language model reasoning with specialized audio processing to understand speech rather than just recognize it. This multi-modal architecture enables new levels of accuracy, adaptability, and control over speech transcription. It supports high-demand features including speaker diarization, timestamp prediction, and multichannel transcription, and can be used as a drop-in replacement to improve the accuracy of existing models.
The standout capability of Slam-1 is its ability to be fine-tuned for specific contexts without model retraining or complex engineering, adapting to capture the terminology and nuances of fields ranging from healthcare to legal proceedings.
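The customization described above is driven by key terms supplied with the transcription request. Below is a minimal sketch of a request body. It assumes the speech_model value "slam-1", an audio_url field, and that keyterms_prompt takes a list of strings. The helper name and the sample terms are hypothetical, and nothing is sent over the network.

```python
import json


def build_slam1_request(audio_url: str, keyterms: list[str]) -> dict:
    """Build a transcription request body for Slam-1 (field values are assumptions)."""
    # Drop blank entries and duplicates while preserving the caller's order.
    cleaned = list(dict.fromkeys(t.strip() for t in keyterms if t.strip()))
    return {
        "audio_url": audio_url,
        "speech_model": "slam-1",    # assumed identifier for Slam-1
        "keyterms_prompt": cleaned,  # parameter named in the Slam-1 bugfix entry
    }


if __name__ == "__main__":
    body = build_slam1_request(
        "https://example.com/cardiology-visit.mp3",
        ["atrial fibrillation", "metoprolol", "metoprolol", " "],
    )
    print(json.dumps(body, indent=2))
```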
Performance Highlights:
- 66% of human evaluators consistently preferred Slam-1 transcripts over our current Universal model and 72% of users preferred Slam-1 transcripts in blind tests over Deepgram’s Nova-3 model
- 20% reduction in formatting errors
- Up to 66% reduction in missed entities (names, places, custom terms) with customization
Refer to our documentation for information about getting started and check out our blog post to learn more about Slam-1.
Dashboard Updates; Scaling Optimization; LeMUR bugfix
Introducing Dark Mode for our dashboard! Users can now switch between light and dark mode via a toggle in the top navigation bar.

Optimized scaling and capacity provisioning to more efficiently handle customer traffic.
Reduced LeMUR errors with targeted improvements to help alleviate scaling issues.
AssemblyAI is now PCI DSS v4.0 Compliant; bugfix
We've upgraded our PCI compliance to PCI DSS v4.0, ensuring our Speech-to-Text API meets the latest payment card industry security standards.
Added retry logic to reduce authentication errors that occurred in some edge cases.
Dashboard Revamp
We have upgraded our dashboard, now with enhanced analytics and improved navigation to help you get more out of your AssemblyAI account.

The new dashboard features:
- Modern UI with improved navigation and streamlined onboarding
- Enhanced analytics with usage and model-specific filtering
- Advanced transcription history with filtering by date, ID, project, and API key
- Dedicated rate limits section showing your account's limits for all endpoints
- Clearer billing information with improved plan details and usage visualization
Our multiple API keys feature is fully integrated with the new dashboard, allowing you to better organize projects and enhance security.
Log in to your AssemblyAI account today to experience the improved interface.
Speaker Labels bugfix
Reduced edge case errors with the Speaker Labels feature that could sometimes occur when the final utterance was a single word.
Multiple API Keys & Projects
We’ve introduced Multiple API Keys and Projects for AssemblyAI accounts. You can now create separate projects for development, staging, and production, making it easier to manage different environments. Within each project, you can set up multiple API keys and track detailed usage and spending metrics. All billing remains centralized while ensuring a clear separation between projects for better organization and control.
Easily manage different environments and streamline your workflow. Visit your dashboard to get started! 🚀
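One way to use per-project keys is to select the key by deployment environment at startup. A minimal sketch follows. The environment variable naming scheme (ASSEMBLYAI_API_KEY_<ENV>) is a hypothetical convention and is not something the dashboard defines.

```python
import os

VALID_ENVS = ("development", "staging", "production")


def api_key_for(env: str, environ=os.environ) -> str:
    """Return the API key for one project (variable names are hypothetical)."""
    if env not in VALID_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    var = f"ASSEMBLYAI_API_KEY_{env.upper()}"
    key = environ.get(var)
    if not key:
        # Fail loudly rather than silently falling back to another project's key.
        raise RuntimeError(f"{var} is not set")
    return key


if __name__ == "__main__":
    fake_env = {"ASSEMBLYAI_API_KEY_STAGING": "staging-key-placeholder"}
    print(api_key_for("staging", fake_env))
```

Refusing to fall back to a different key keeps usage and spend attributed to the correct project.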
Universal improvements
Last week we delivered improvements to our October 2024 Universal release across latency, accuracy, and language coverage.
Universal demonstrates the lowest standard error rate when compared to leading models on the market for English, German, and Spanish.

Additionally, these improvements to accuracy are accompanied by significant increases in processing speed. Our latest Universal release achieves a 27.4% speedup in inference time for the vast majority of files (at the 95th percentile), enabling faster transcription at scale.
These changes also build on Universal's already best-in-class English performance to bring significant upgrades to last-mile challenges, so Universal faithfully captures the fine details that make transcripts usable, such as proper nouns, alphanumerics, and formatting.

You can read our launch blog to learn more about these Universal updates.
Ukrainian support for Speaker Diarization
Our Speaker Diarization service now supports Ukrainian speech. This update enables automatic speaker labeling for Ukrainian audio files, making transcripts more readable and powering downstream features in multi-speaker contexts.
Here's how you can get started obtaining Ukrainian speaker labels using our Python SDK:
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"
audio_file = "/path/to/your/file"

# Enable speaker labels and set the language to Ukrainian ("uk").
config = aai.TranscriptionConfig(
    speaker_labels=True,
    language_code="uk",
)

transcript = aai.Transcriber().transcribe(audio_file, config)

# Print each utterance with its speaker label.
for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
Check out our Docs for more information.
Claude 2 sunset
As previously announced, we sunset Claude 2 and Claude 2.1 for LeMUR on February 6th.
If you were previously using these models, we recommend switching to Claude 3.5 Sonnet, which is both more performant and less expensive than Claude 2. You can do so via the final_model parameter in LeMUR requests. This parameter is now required.
We have also sunset the lemur/v3/generate/action-items endpoint.
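Because final_model is now required, a request that omits it can be caught before it is sent. Below is a minimal sketch of a LeMUR task request body. The model identifier strings and the prompt and transcript_ids fields are assumptions to check against the LeMUR API reference, and the helper itself is hypothetical.

```python
# Assumed identifiers for the models sunset on February 6th.
SUNSET_MODELS = {"anthropic/claude-2", "anthropic/claude-2-1"}


def build_lemur_task(prompt: str, transcript_ids: list[str], final_model: str) -> dict:
    """Build a LeMUR task request body, rejecting missing or sunset models."""
    if not final_model:
        raise ValueError("final_model is required")
    if final_model in SUNSET_MODELS:
        raise ValueError(f"{final_model} has been sunset; choose a current model")
    return {
        "final_model": final_model,
        "prompt": prompt,
        "transcript_ids": list(transcript_ids),
    }


if __name__ == "__main__":
    body = build_lemur_task(
        "Summarize the call in three bullet points.",
        ["<TRANSCRIPT_ID>"],
        "anthropic/claude-3-5-sonnet",  # assumed identifier for Claude 3.5 Sonnet
    )
    print(body)
```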
Reduced hallucination rates; Bugfix
We have reduced Universal-2's hallucination rate for the string "sa" during periods of silence.
We have fixed a rare bug in our Speaker Labels service that would occasionally cause requests to fail and return a server error.
Multichannel audio trim fix
We've fixed an issue that caused the audio_start_from and audio_end_at parameters to be ignored for multichannel audio.
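A request that combines trimming with multichannel transcription might look like the sketch below. It assumes audio_start_from and audio_end_at are expressed in milliseconds and that the multichannel and audio_url fields exist as named. The helper is hypothetical and sends nothing.

```python
def build_trimmed_multichannel_request(audio_url: str, start_ms: int, end_ms: int) -> dict:
    """Build a request body that trims multichannel audio (units assumed to be ms)."""
    if start_ms < 0:
        raise ValueError("start_ms must be non-negative")
    if end_ms <= start_ms:
        raise ValueError("end_ms must be greater than start_ms")
    return {
        "audio_url": audio_url,
        "multichannel": True,
        "audio_start_from": start_ms,
        "audio_end_at": end_ms,
    }


if __name__ == "__main__":
    # Transcribe only the span from 0:05 to 1:00 of a two-channel recording.
    print(build_trimmed_multichannel_request("https://example.com/call.wav", 5_000, 60_000))
```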
Platform enhancements and security updates
🌍 Simplified EU Data Residency & Management
We've simplified EU operations with instant access to:
- Self-serve EU data processing via our EU endpoint
- Complete data sovereignty for EU operations
- Regional usage filtering and cost tracking
- Reduced latency for EU-based operations
✅ Enhanced Security & Compliance
- Full-scope SOC 2 Type 2 certification across all Trust Service Criteria
- ISO 27001 certification achievement
- Enhanced security controls across our infrastructure
You can read more about these new enhancements in our related blog.
Reduced hallucination rates
We have reduced Universal-2's hallucination rate for the word "it" during periods of silence.
New dashboard features
Two new features are available to users on their dashboards:
- Users can now see and filter more historical usage and spend data
- Users can now see usage and spend by the hour for a given day
Reliability improvements
We've made reliability improvements for Claude models in our LeMUR framework.
We've made adjustments to our infrastructure so that users should see fewer timeout errors when using our Nano tier with some languages.