The speech_models parameter lets you specify which model to use for transcription. You can provide multiple models in priority order, and our system will automatically route to the best available model based on your request.
You must include the speech_models parameter in every pre-recorded transcription request. There is no default model. If you omit speech_models, the request will fail.
Model routing behavior: The system attempts to use the models in priority order, falling back to the next model when needed. For example, with ["universal-3-pro", "universal-2"], the system will use universal-3-pro for the languages it supports (English, Spanish, Portuguese, French, German, and Italian) and automatically fall back to universal-2 for all other languages. This ensures you get the best-performing transcription where available while maintaining the widest language coverage.
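The fallback rule described above can be modeled locally in a few lines. This is only an illustrative sketch, not the service's routing code: the function names `supports` and `route` are hypothetical, the ISO 639-1 language codes are an assumption, and universal-2 is treated as a catch-all for simplicity.

```python
# Illustrative local model of priority-order routing; not the service's implementation.
# Assumption: languages are identified by ISO 639-1 codes.
UNIVERSAL_3_PRO_LANGUAGES = {"en", "es", "pt", "fr", "de", "it"}


def supports(model: str, language_code: str) -> bool:
    """Return True if this sketch considers `model` able to handle the language."""
    if model == "universal-3-pro":
        return language_code in UNIVERSAL_3_PRO_LANGUAGES
    # Assumption: any other model (for example universal-2) accepts the language.
    return True


def route(speech_models: list[str], language_code: str) -> str:
    """Pick the first model, in priority order, that supports the language."""
    for model in speech_models:
        if supports(model, language_code):
            return model
    raise ValueError(f"no model in {speech_models} supports {language_code!r}")
```

With `["universal-3-pro", "universal-2"]`, French audio stays on the first model, while Japanese audio falls through to the second. Listing only `["universal-3-pro"]` leaves no fallback for unsupported languages in this sketch.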
We recommend Universal-3 Pro for pre-recorded audio transcription. It delivers the highest accuracy and fastest transcription out of the box, with optional prompting for when you need more control. For the broadest language coverage (99 languages), use ["universal-3-pro", "universal-2"] to automatically fall back to Universal-2 for unsupported languages.
You can change the model by setting the speech_models parameter in the POST request body:
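A minimal sketch of such a request, using only the Python standard library. The endpoint URL, the `authorization` header, and the `audio_url` field are assumptions that should be checked against the API reference; `build_body` and `submit` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint; verify against the API reference before use.
TRANSCRIPT_URL = "https://api.assemblyai.com/v2/transcript"


def build_body(audio_url, speech_models):
    """Build the JSON body, rejecting an empty model list up front."""
    if not speech_models:
        # There is no default model, so fail locally instead of sending a bad request.
        raise ValueError("speech_models is required; there is no default model")
    return {"audio_url": audio_url, "speech_models": list(speech_models)}


def submit(api_key, body):
    """POST the body and return the decoded JSON response."""
    request = urllib.request.Request(
        TRANSCRIPT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call might look like `submit(api_key, build_body("https://example.com/audio.mp3", ["universal-3-pro", "universal-2"]))`, where the audio URL is a placeholder. The order of the list is the priority order.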
After transcription completes, you can check which model was actually used to process your request by reading the speech_model_used field. This is useful when you provide multiple models in the speech_models array, as the system may fall back to a different model depending on language support.
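As a rough illustration, the check can be done on the decoded transcript JSON. The helper names `model_used` and `fell_back` are hypothetical, and the sample dictionary in the usage note is made up for the example; only the speech_model_used field name comes from this page.

```python
from typing import Optional, Sequence


def model_used(transcript: dict) -> Optional[str]:
    """Return the model reported in speech_model_used, or None if absent."""
    return transcript.get("speech_model_used")


def fell_back(transcript: dict, requested: Sequence[str]) -> bool:
    """Return True if a model other than the first-priority one was used."""
    used = model_used(transcript)
    return used is not None and len(requested) > 0 and used != requested[0]
```

For a completed transcript such as `{"status": "completed", "speech_model_used": "universal-2"}` requested with `["universal-3-pro", "universal-2"]`, `fell_back` returns `True`, which signals that the audio was in a language the first model does not cover.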
Here is the full working code that demonstrates model selection with error handling:
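A minimal end-to-end sketch under stated assumptions: the base URL, the `authorization` header, the `audio_url` field, the `id`, `status`, and `error` response fields, and the polling endpoint are all assumed here and should be verified against the API reference. `TranscriptionError`, `http_json`, and `transcribe` are hypothetical names. The transport is passed in as `send` so the flow can be exercised without network access.

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "https://api.assemblyai.com/v2"  # assumed base URL


class TranscriptionError(Exception):
    """Raised when a request is rejected or the transcript ends in an error state."""


def http_json(method, url, api_key, body=None):
    """Send one JSON request and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        # For example, a request that omits speech_models is expected to fail here.
        detail = exc.read().decode("utf-8", errors="replace")
        raise TranscriptionError(f"HTTP {exc.code}: {detail}") from exc
    except urllib.error.URLError as exc:
        raise TranscriptionError(f"network error: {exc.reason}") from exc


def transcribe(audio_url, speech_models, api_key,
               send=http_json, poll_seconds=3.0, max_polls=200):
    """Submit a transcription job and poll until it completes or fails."""
    if not speech_models:
        raise TranscriptionError("speech_models is required; there is no default model")
    job = send("POST", f"{BASE_URL}/transcript", api_key,
               {"audio_url": audio_url, "speech_models": list(speech_models)})
    polls = 0
    while True:
        status = job.get("status")
        if status == "completed":
            return job
        if status == "error":
            raise TranscriptionError(job.get("error", "transcription failed"))
        if polls >= max_polls:
            raise TranscriptionError("timed out waiting for transcription")
        time.sleep(poll_seconds)
        job = send("GET", f"{BASE_URL}/transcript/{job['id']}", api_key)
        polls += 1


if __name__ == "__main__":
    # Placeholder key and audio URL; replace both before running.
    try:
        result = transcribe("https://example.com/audio.mp3",
                            ["universal-3-pro", "universal-2"], "YOUR_API_KEY")
        print("model used:", result.get("speech_model_used"))
    except TranscriptionError as exc:
        print("transcription failed:", exc)
```

Injecting `send` keeps the retry and error logic separate from the HTTP details, so the same function works with a different HTTP client or with a stub in tests.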