Since day one, AssemblyAI has focused on serving the growing needs of the thousands of developers and organizations who build on our API every day. With this in mind, in the last year alone we launched a price reduction, latency improvements, new integrations, and our next-gen Universal-1 model.
To help you balance cost, speed, and accuracy as you build, we now offer two new Speech-to-Text tiers: Best and Nano. These tiers offer different levels of power, capabilities and price points.
The Best and Nano tiers make it easy to choose the level of Speech-to-Text precision that your use case requires. You are never locked into using one specific tier when building with our Speech AI models, which gives you the opportunity to build at scale and access different price points depending on what you’re developing.
How to Choose the Right Speech-to-Text Tier
The Best and Nano tiers both provide powerful Speech AI capabilities but differ in functionality and price, so the right choice depends on your budget constraints and accuracy requirements. Here's a quick guide to help you decide when to use each tier:
Best tier (Default for customers): The Best tier is our most accurate and advanced Speech-to-Text offering. It is well suited for those who need to capture the nuances of voice data and handle complex audio files that have noisy backgrounds, multiple speakers, or accented speech. (Note: You can change your default tier in the API.)
Our Best tier houses our most powerful Speech AI models, including Universal-1, with its cutting-edge accuracy and capabilities. You can see a full breakdown of Universal-1’s performance here.
Nano tier: This tier is our newest offering and provides high-quality Speech-to-Text at an accessible price point. The Nano tier works well for content generation, topic detection, and more, and is ideal for those who require cost efficiency or want to test out Speech-to-Text models in a low-cost way.
Compare AssemblyAI Speech-to-Text Tiers
Best tier (Default) | Nano tier |
---|---|
Best-in-class/premium, highest accuracy | Combination of accuracy & speed |
Starts at $0.37/hour | Starts at $0.12/hour |
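Suggested use cases: complex audio with noisy backgrounds, multiple speakers, or accented speech | Suggested use cases: content generation, topic detection, and low-cost testing of Speech-to-Text models |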
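To make the price difference in the table concrete, here is a rough sketch that estimates cost from the starting prices listed above. The function name `estimate_cost` and the 1,000-hour workload are illustrative assumptions, and actual billing may differ from these starting rates.

```python
from decimal import Decimal

# Starting prices per hour of audio, taken from the comparison table above.
STARTING_PRICE_PER_HOUR = {"best": Decimal("0.37"), "nano": Decimal("0.12")}


def estimate_cost(audio_hours, tier):
    """Return the estimated cost in USD for the given hours of audio."""
    return STARTING_PRICE_PER_HOUR[tier] * Decimal(str(audio_hours))


hours = 1000  # hypothetical monthly workload
best = estimate_cost(hours, "best")
nano = estimate_cost(hours, "nano")
print(f"Best: ${best:.2f}  Nano: ${nano:.2f}  Difference: ${best - nano:.2f}")
# prints "Best: $370.00  Nano: $120.00  Difference: $250.00"
```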
No two use cases are the same, so we’ve intentionally made it easy to try both of our Speech-to-Text tiers yourself. You can determine the best tier for your needs directly in our API or in the Playground.
How to Select Best or Nano tier in the API
Follow these instructions to change your default tier in the API with one line of code in your transcription request.
Change the tier by setting the speech_model parameter in the transcription config.
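As a minimal sketch of what this can look like, the example below builds a transcription request using only the Python standard library. The endpoint URL, the `audio_url` field, the `Authorization` header, and the lowercase values `"best"` and `"nano"` are assumptions here, and the helper names `build_transcript_request` and `submit` are hypothetical, so check the Docs for the exact request shape.

```python
import json
import urllib.request

# Assumed REST endpoint; confirm against the official API reference.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"
# Assumed string values for the two tiers.
VALID_SPEECH_MODELS = {"best", "nano"}


def build_transcript_request(audio_url, api_key, speech_model=None):
    """Build (but do not send) a transcription request for the chosen tier."""
    body = {"audio_url": audio_url}
    if speech_model is not None:
        if speech_model not in VALID_SPEECH_MODELS:
            raise ValueError(f"unknown speech_model: {speech_model!r}")
        # The one line that selects the tier.
        body["speech_model"] = speech_model
    # When speech_model is left out of the body, the tier is not specified
    # and the service-side default applies.
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `submit(build_transcript_request(url, key, speech_model="nano"))` would then send the request on the Nano tier; keeping the build and send steps separate makes it easy to inspect the payload before any network call.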
If you do not set the speech_model parameter explicitly, it will default to Best. If you have any questions, visit our Docs or reach out to Customer Support at support@assemblyai.com.
Build Applications Your Way
The Best and Nano tiers let you find the perfect fit for your application—whether you need the accuracy and scalability of Best or the cost-effective capabilities of Nano.