Changelog

Follow along to see weekly accuracy and product improvements.

May 12, 2025

Error Message Improvement

Improved the error message returned when the region used to upload a file via the /upload endpoint does not match the region used to transcribe the resulting URL.
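To avoid triggering this error, the upload call and the subsequent transcription request need to target the same regional base URL. A minimal sketch (the helper name is illustrative; the hosts match our US and EU endpoints):

```python
# Illustrative helper: the /v2/upload call and the subsequent transcription
# request must use the same regional base URL.
BASE_URLS = {
    "us": "https://api.assemblyai.com",
    "eu": "https://api.eu.assemblyai.com",
}

def endpoints_for(region: str) -> tuple[str, str]:
    """Return a matching (upload, transcript) endpoint pair for one region."""
    base = BASE_URLS[region]
    return f"{base}/v2/upload", f"{base}/v2/transcript"

# Use both URLs from the same pair for the whole upload-then-transcribe flow.
upload_url, transcript_url = endpoints_for("eu")
```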

May 7, 2025

Enhanced Account Security

We've added Email Verification and Google OAuth:

  • Google authentication users: If your account email is a Gmail address, simply click 'Continue with Google' for instant access, followed by account verification - no additional linking is needed.
  • Email/password users: On your first login after this update, you'll receive a one-time link to reset your password. Click the link, set a new password, and access your dashboard.

April 30, 2025

New LeMUR Models

We've expanded LeMUR capabilities with two powerful new models:

  • Claude 3.7 Sonnet - The most intelligent model to date, featuring enhanced reasoning capabilities for complex audio analysis tasks.
  • Claude 3.5 Haiku - The fastest model, optimized for quick responses while maintaining excellent reasoning abilities.

Whether you're analyzing customer calls, generating meeting summaries, or performing audio content analysis, these models deliver significant improvements. 

You can begin using these new models right away with your existing LeMUR implementation. For detailed instructions on integration, model parameters, and code examples across all supported programming languages, check out our docs.
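As a rough sketch, selecting one of the new models is a matter of setting the final_model value in a LeMUR request. The exact model identifier string below is an assumption, so confirm it against the docs before using it:

```python
import json
import urllib.request

# Example body for LeMUR's /lemur/v3/generate/task endpoint. The
# final_model identifier here is an assumed value -- confirm it in the docs.
payload = {
    "transcript_ids": ["<TRANSCRIPT_ID>"],
    "prompt": "Summarize the key decisions made in this meeting.",
    "final_model": "anthropic/claude-3-7-sonnet-20250219",
}

def run_lemur_task(api_key: str) -> dict:
    """POST the task request and return the parsed JSON response."""
    request = urllib.request.Request(
        "https://api.assemblyai.com/lemur/v3/generate/task",
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Because the models share the same endpoint and parameters, switching between Claude 3.7 Sonnet and Claude 3.5 Haiku only changes the final_model string.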

April 28, 2025

🚀 Slam-1 Public Beta 🚀

Slam-1, our new customizable Speech Language Model, is now available in public beta! 

Slam-1 combines large language model reasoning with specialized audio processing to understand speech rather than just recognize it. This multi-modal architecture enables new levels of accuracy, adaptability, and control over speech transcription, with high-demand features including speaker diarization, timestamp prediction, and multichannel transcription, and it can be used as a drop-in replacement to improve the accuracy of existing models.

The standout capability of Slam-1 is that it can be fine-tuned for specific contexts without model retraining or complex engineering, adapting to capture the terminology and nuances of fields ranging from healthcare to legal proceedings.

Performance Highlights:

  • 66% of human evaluators preferred Slam-1 transcripts over our current Universal model, and 72% preferred Slam-1 transcripts over Deepgram’s Nova-3 model in blind tests
  • 20% reduction in formatting errors
  • Up to 66% reduction in missed entities (names, places, custom terms) with customization

Comparison of Slam-1 and Universal in terms of WER and FWER.

Refer to our documentation for information about getting started and check out our blog post to learn more about Slam-1.
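Because Slam-1 is a drop-in replacement, trying it can be as simple as setting the speech model in an ordinary transcription request. A hedged sketch: the keyterms_prompt parameter name for supplying domain terminology is an assumption here, so verify it in the Slam-1 docs.

```python
import json
import urllib.request

# Example request body selecting Slam-1 and supplying domain terms for
# customization. keyterms_prompt is an assumed parameter name -- check the docs.
payload = {
    "audio_url": "https://example.com/clinical-call.mp3",
    "speech_model": "slam-1",
    "keyterms_prompt": ["differential diagnosis", "myocardial infarction"],
}

def submit_transcript(api_key: str) -> dict:
    """Create a transcript job and return the parsed JSON response."""
    request = urllib.request.Request(
        "https://api.assemblyai.com/v2/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```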

April 11, 2025

Dashboard Updates; Scaling Optimization; LeMUR bugfix

Introducing Dark Mode for our dashboard! Users can now switch between light and dark mode via a toggle in the top navigation bar.

Optimized scaling and capacity provisioning to more efficiently handle customer traffic.

Reduced LeMUR errors with targeted improvements to help alleviate scaling issues.

April 7, 2025

AssemblyAI is now PCI DSS v4.0 Compliant; bugfix

We've upgraded our PCI compliance to PCI DSS v4.0, ensuring our Speech-to-Text API meets the latest payment card industry security standards.

Added additional retry logic to reduce edge case authentication errors that would sometimes occur.

March 31, 2025

Dashboard Revamp

We have upgraded our dashboard—now with enhanced analytics and improved navigation to help you get more out of your AssemblyAI account. 

The new dashboard features:

  • Modern UI with improved navigation and streamlined onboarding
  • Enhanced analytics with usage and model-specific filtering
  • Advanced transcription history with filtering by date, ID, project, and API key
  • Dedicated rate limits section showing your account's limits for all endpoints
  • Clearer billing information with improved plan details and usage visualization

Our multiple API keys feature is fully integrated with the new dashboard, allowing you to better organize projects and enhance security.

Log in to your AssemblyAI account today to experience the improved interface.

March 24, 2025

Speaker Labels bugfix

Reduced edge case errors with the Speaker Labels feature that could sometimes occur when the final utterance was a single word.

March 11, 2025

Multiple API Keys & Projects

We’ve introduced Multiple API Keys and Projects for AssemblyAI accounts. You can now create separate projects for development, staging, and production, making it easier to manage different environments. Within each project, you can set up multiple API keys and track detailed usage and spending metrics. All billing remains centralized while ensuring a clear separation between projects for better organization and control.

Easily manage different environments and streamline your workflow. Visit your dashboard to get started! 🚀

March 3, 2025

Update to List Endpoint

We’ve split our list endpoint into two separate endpoints: one for data processed on EU servers and one for data processed on US servers. Previously, the list endpoint returned transcripts from both regions.

The US list endpoint is https://api.assemblyai.com/v2/transcript

The EU list endpoint is https://api.eu.assemblyai.com/v2/transcript 

When using these endpoints, transcripts are sorted from newest to oldest and can be retrieved for the last 90 days of usage. If you need to retrieve transcripts from more than 90 days ago please reach out to our Support team at support@assemblyai.com.
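A minimal sketch of building a region-specific list request from the two endpoints above. The limit query parameter used for paging is an assumption, so check the API reference for the supported parameters:

```python
import urllib.parse

# The two region-specific list endpoints from this changelog entry.
LIST_ENDPOINTS = {
    "us": "https://api.assemblyai.com/v2/transcript",
    "eu": "https://api.eu.assemblyai.com/v2/transcript",
}

def list_url(region: str, limit: int = 20) -> str:
    """Build the list-transcripts URL for one region.

    The `limit` paging parameter is an assumed name -- verify in the docs.
    """
    query = urllib.parse.urlencode({"limit": limit})
    return f"{LIST_ENDPOINTS[region]}?{query}"
```

A GET request to the returned URL (with your API key in the authorization header) lists that region's transcripts, newest first.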

February 24, 2025

Universal improvements

Last week we delivered improvements to our October 2024 Universal release across latency, accuracy, and language coverage.

Universal demonstrates the lowest standard error rate when compared to leading models on the market for English, German, and Spanish:

Average word error rate (WER) across languages for several providers. WER is a canonical metric in speech-to-text that measures typical accuracy (lower is better). Descriptions of our evaluation sets can be found in our October release blog post.

These accuracy improvements are accompanied by significant increases in processing speed: our latest Universal release achieves a 27.4% speedup in inference time at the 95th percentile, enabling faster transcription at scale for the vast majority of files.

These changes also build on Universal's already best-in-class English performance to bring significant upgrades to last-mile challenges, meaning that Universal faithfully captures the fine details that make transcripts usable, like proper nouns, alphanumerics, and formatting.

Comparative error rates across speech recognition models, with lower values indicating better performance. Descriptions of our evaluation sets can be found in our October release blog post.

You can read our launch blog to learn more about these Universal updates.

February 14, 2025

Ukrainian support for Speaker Diarization

Our Speaker Diarization service now supports Ukrainian speech. This update enables automatic speaker labeling for Ukrainian audio files, making transcripts more readable and powering downstream features in multi-speaker contexts.

Here's how you can get started obtaining Ukrainian speaker labels using our Python SDK:

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"
audio_file = "/path/to/your/file"

config = aai.TranscriptionConfig(
  speaker_labels=True,  # enable Speaker Diarization
  language_code="uk"    # transcribe Ukrainian speech
)

transcript = aai.Transcriber().transcribe(audio_file, config)

# Print each utterance with its speaker label
for utterance in transcript.utterances:
  print(f"Speaker {utterance.speaker}: {utterance.text}")

Check out our Docs for more information.

February 11, 2025

Claude 2 sunset

As previously announced, we sunset Claude 2 and Claude 2.1 for LeMUR on February 6th.

If you were previously using these models, we recommend switching to Claude 3.5 Sonnet, which is both more performant and less expensive than Claude 2. You can select a model via the final_model parameter in LeMUR requests; note that this parameter is now required.

Additionally, we have sunset the lemur/v3/generate/action-items endpoint.

February 10, 2025

Reduced hallucination rates; Bugfix

We have reduced Universal-2's hallucination rate for the string "sa" during periods of silence.

We have fixed a rare bug in our Speaker Labels service that would occasionally cause requests to fail and return a server error.

February 5, 2025

Multichannel audio trim fix

We've fixed an issue which caused the audio_start_from and audio_end_at parameters to not be respected for multichannel audio.
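For reference, a minimal request body exercising these parameters might look like the following sketch; both trim values are expressed in milliseconds per the API reference:

```python
# Example transcript request body trimming a multichannel file.
# audio_start_from and audio_end_at are in milliseconds.
payload = {
    "audio_url": "https://example.com/stereo-call.mp3",
    "multichannel": True,
    "audio_start_from": 15_000,  # begin transcribing at 0:15
    "audio_end_at": 75_000,      # stop transcribing at 1:15
}
```

With this fix, the trimmed window now applies to every channel of the file.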

February 3, 2025

Platform enhancements and security updates

🌍 Simplified EU Data Residency & Management

We've simplified EU operations with instant access to:

  • Self-serve EU data processing via our EU endpoint
  • Complete data sovereignty for EU operations
  • Regional usage filtering and cost tracking
  • Reduced latency for EU-based operations

✅ Enhanced Security & Compliance

  • Full-scope SOC 2 Type 2 certification across all Trust Service Criteria
  • ISO 27001 certification achievement
  • Enhanced security controls across our infrastructure

You can read more about these new enhancements in our related blog.

January 31, 2025

Reduced hallucination rates

We have reduced Universal-2's hallucination rate for the word "it" during periods of silence.

January 15, 2025

New dashboard features

Two new features are available to users on their dashboards:

  1. Users can now see and filter more historical usage and spend data
  2. Users can now see usage and spend by the hour for a given day

December 20, 2024

Reliability improvements

We've made reliability improvements for Claude models in our LeMUR framework.

We've made adjustments to our infrastructure so that users should see fewer timeout errors when using our Nano tier with some languages.

December 19, 2024

LiveKit 🤝 AssemblyAI

We've released the AssemblyAI integration for the LiveKit Agents framework, allowing developers to use our Streaming Speech-to-Text model in their real-time LiveKit applications.

LiveKit is a powerful platform for building real-time audio and video applications. It abstracts away the complicated details of building real-time applications so developers can rapidly build and deploy applications for video conferencing, livestreaming, and more.

Check out our tutorial on How to build a LiveKit app with real-time Speech-to-Text to see how you can build a real-time transcription chat feature using the integration. You can browse all of our integrations on the Integrations page of our Docs.