Multilingual streaming
Multilingual streaming allows you to transcribe audio streams in multiple languages.
Configuration
Keyterms prompting is not supported with multilingual streaming.
To use multilingual streaming, include speech_model=universal-streaming-multilingual as a query parameter in the WebSocket URL.
Supported languages
Multilingual streaming currently supports English, Spanish, French, German, Italian, and Portuguese.
Language detection
The multilingual streaming model supports automatic language detection, so you can identify which language is being spoken in real time. When enabled, the model returns the detected language code and a confidence score with each complete utterance.
Configuration
To enable language detection, include language_detection=true as a query parameter in the WebSocket URL.
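As a rough sketch, the query string with both parameters can be assembled with Python's standard library. The base URL below is a placeholder assumption, not an endpoint stated on this page, and the helper name build_streaming_url is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace with the real streaming WebSocket URL.
BASE_URL = "wss://example.invalid/ws"


def build_streaming_url(base_url: str, language_detection: bool = False) -> str:
    """Append the multilingual query parameters to a WebSocket base URL."""
    params = {"speech_model": "universal-streaming-multilingual"}
    if language_detection:
        # Query-string booleans are sent as lowercase text.
        params["language_detection"] = "true"
    return f"{base_url}?{urlencode(params)}"


print(build_streaming_url(BASE_URL, language_detection=True))
```

Keeping the parameters in a dictionary leaves URL encoding to urlencode, which makes it harder to end up with a malformed query string if more options are added later.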
Output format
When language detection is enabled, each Turn message with a complete utterance will include two additional fields:
language_code: The language code of the detected language (e.g., "es" for Spanish, "fr" for French)
language_confidence: A confidence score between 0 and 1 indicating how confident the model is in the language detection
The language_code and language_confidence fields only appear when the utterance field is non-empty and contains a complete utterance.
Example response
For example, when Spanish is detected, a Turn message with language detection enabled carries a language_code of "es" and a language_confidence such as 0.999997.
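A client might read these fields along the following lines. This is a minimal sketch: the sample payload is illustrative only, the "type" field and the utterance text are assumptions made for the example, and a real Turn message carries additional fields. The helper name detected_language is hypothetical.

```python
import json
from typing import Optional, Tuple

# Illustrative payload only: values are made up for this sketch, and the
# "type" field is an assumption about how Turn messages are tagged.
SAMPLE_TURN = json.dumps({
    "type": "Turn",
    "utterance": "Buenos días.",
    "language_code": "es",
    "language_confidence": 0.999997,
})


def detected_language(raw_message: str) -> Optional[Tuple[str, float]]:
    """Return (language_code, language_confidence), or None when absent."""
    message = json.loads(raw_message)
    if message.get("type") != "Turn":
        return None
    # The language fields accompany only a non-empty, complete utterance.
    if not message.get("utterance"):
        return None
    code = message.get("language_code")
    confidence = message.get("language_confidence")
    if code is None or confidence is None:
        return None
    return code, float(confidence)


print(detected_language(SAMPLE_TURN))
```

Returning None for partial turns lets calling code ignore messages without a complete utterance instead of failing on missing keys.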
Understanding formatting
The multilingual model produces transcripts with punctuation and capitalization already built into the model outputs. This means you’ll receive properly formatted text without requiring any additional post-processing.
While the API still returns the turn_is_formatted parameter to maintain interface consistency with other streaming models, the multilingual model doesn’t perform additional formatting operations. All transcripts from the multilingual model are already formatted as they’re generated.
In the future, this built-in formatting capability will be extended to our English-only streaming model as well.
Quickstart
Python
Javascript
First, install the required dependencies.