If you’re already using Universal Streaming, you can quickly test Universal-3 Pro Streaming by switching the speech_model parameter to "u3-rt-pro" and removing format_turns (formatting is always on in U3 Pro). Just update the connection params and start streaming.
# Before (Universal Streaming)
CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "format_turns": True,
}

# After (Universal-3 Pro Streaming)
CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
}
That’s it for a quick test. But there are important behavioral differences
in turn detection, partials, and formatting that may require updates to your
message handling logic. Read on for the full migration details.
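As a minimal sketch of the quick-test change, the helper below converts an existing Universal Streaming parameter dict into a Universal-3 Pro one. The function names and the `REMOVED_PARAMS` tuple are hypothetical, and the base URL is an assumption about your setup rather than something this guide specifies; `end_of_turn_confidence_threshold` is dropped because it does not exist on Universal-3 Pro.

```python
from urllib.parse import urlencode

# Assumption: base URL of the streaming WebSocket; confirm against your own setup.
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"

# Parameters this guide says to drop when moving to Universal-3 Pro.
REMOVED_PARAMS = ("format_turns", "end_of_turn_confidence_threshold")


def migrate_connection_params(old_params: dict) -> dict:
    """Return a copy of old_params adjusted for Universal-3 Pro Streaming."""
    new_params = {k: v for k, v in old_params.items() if k not in REMOVED_PARAMS}
    new_params["speech_model"] = "u3-rt-pro"
    return new_params


def build_endpoint(params: dict) -> str:
    """Append the connection params to the base URL as a query string."""
    return f"{API_ENDPOINT_BASE_URL}?{urlencode(params)}"


old = {"sample_rate": 16000, "format_turns": True}
new = migrate_connection_params(old)
print(new)
print(build_endpoint(new))
```

The original dict is left untouched, so both configurations can be kept side by side while you compare the two models.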
| Feature | Universal Streaming | Universal-3 Pro | Migration action |
| --- | --- | --- | --- |
| min_turn_silence | 400 ms (minimum silence before checking confidence) | 100 ms (silence before speculative EOT check) | Review and adjust if you tuned this value |
| max_turn_silence | 1280 ms | 1000 ms | Review and adjust if you tuned this value |
| end_of_turn / turn_is_formatted | The model has built-in formatting, so turn_is_formatted is true on all turns, including partials; do not use it as a turn-end signal. Use end_of_turn: true to detect when a turn has completed. | Always the same value: one end-of-turn transcript per turn, always formatted | Simplify: just check end_of_turn: true for the final formatted transcript |
| Partials | Emitted frequently during speech (unformatted on English model, formatted on multilingual model) | Early partial at ~750 ms, silence-based partials, plus continuous partials every ~3 s during long turns (continuous_partials enabled by default) | Expect stable, fully-transcribed partials rather than word-by-word updates |
| prompt | Not supported | Supported: contextual prompting (describe the audio) | New capability (optional) |
| keyterms_prompt | Supported (connection-time only; not updatable mid-stream) | Supported; can be used together with prompt; updatable mid-stream | No change needed; new: can combine with prompt and update via UpdateConfiguration |
| UpdateConfiguration | Turn detection params only (end_of_turn_confidence_threshold, min_turn_silence, max_turn_silence) | | |
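The message-handling change can be sketched as follows: route each Turn message on `end_of_turn` alone and ignore `turn_is_formatted`. The handler name is hypothetical, and the `type` and `transcript` field names are assumptions about the message shape; only `end_of_turn` and `turn_is_formatted` are named in this guide.

```python
import json


def handle_message(raw: str, finals: list, partials: list) -> None:
    """Route one JSON message from the stream into finals or partials."""
    message = json.loads(raw)
    if message.get("type") != "Turn":
        return  # this sketch ignores every non-Turn message
    transcript = message.get("transcript", "")
    # turn_is_formatted is true even on partials in U3 Pro, so it is not consulted.
    if message.get("end_of_turn"):
        finals.append(transcript)
    else:
        partials.append(transcript)


finals, partials = [], []
handle_message(json.dumps({
    "type": "Turn",
    "transcript": "I'd like to check",
    "end_of_turn": False,
    "turn_is_formatted": True,
}), finals, partials)
handle_message(json.dumps({
    "type": "Turn",
    "transcript": "I'd like to check my order.",
    "end_of_turn": True,
    "turn_is_formatted": True,
}), finals, partials)
print(finals, partials)
```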
This is the most significant behavioral difference between the two models.
Universal Streaming uses a confidence-based system combining semantic and acoustic detection (source):
| Parameter | Default | Description |
| --- | --- | --- |
| end_of_turn_confidence_threshold | 0.4 | Confidence threshold (0.0-1.0) to trigger end of turn (officially deprecated) |
| min_turn_silence | 400 ms | Minimum silence before checking confidence |
| max_turn_silence | 1280 ms | Maximum silence before forcing end of turn |
The model evaluates end_of_turn_confidence during silence. If the score exceeds end_of_turn_confidence_threshold after min_turn_silence, the turn ends. Otherwise, the turn is forced to end after max_turn_silence.
Universal-3 Pro uses a punctuation-based system (source):
| Parameter | Default | Description |
| --- | --- | --- |
| min_turn_silence | 100 ms | Silence before a speculative end-of-turn check fires |
| max_turn_silence | 1000 ms | Maximum silence before a turn is forced to end |
When silence reaches min_turn_silence, the model transcribes the audio and checks for terminal punctuation (.?!):
Terminal punctuation found — the turn ends (end_of_turn: true)
No terminal punctuation — a partial is emitted (end_of_turn: false) and the turn continues
Silence reaches max_turn_silence — the turn is forced to end regardless of punctuation
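The two strategies can be compared with a simplified sketch. These functions are illustrative models of the rules described above, not the service's implementation: they evaluate a single silence measurement, whereas the real system runs on a live audio stream. The function names and return labels are hypothetical; the defaults come from the tables above.

```python
TERMINAL_PUNCTUATION = (".", "?", "!")


def u3_pro_decision(silence_ms: int, transcript: str,
                    min_turn_silence: int = 100,
                    max_turn_silence: int = 1000) -> str:
    """Punctuation-based rule: returns 'end_of_turn', 'partial', or 'wait'."""
    if silence_ms >= max_turn_silence:
        return "end_of_turn"  # forced, regardless of punctuation
    if silence_ms >= min_turn_silence:
        if transcript.rstrip().endswith(TERMINAL_PUNCTUATION):
            return "end_of_turn"
        return "partial"  # emitted with end_of_turn: false; the turn continues
    return "wait"


def universal_streaming_decision(silence_ms: int, confidence: float,
                                 threshold: float = 0.4,
                                 min_turn_silence: int = 400,
                                 max_turn_silence: int = 1280) -> str:
    """Confidence-based rule: returns 'end_of_turn' or 'wait'."""
    if silence_ms >= max_turn_silence:
        return "end_of_turn"  # forced, regardless of confidence
    if silence_ms >= min_turn_silence and confidence > threshold:
        return "end_of_turn"
    return "wait"


print(u3_pro_decision(150, "What time is it?"))
print(u3_pro_decision(150, "I was thinking that"))
print(universal_streaming_decision(500, 0.7))
```

One practical consequence: on Universal-3 Pro, an utterance that trails off without terminal punctuation only ends once max_turn_silence is reached, so that value matters more if you tuned it on Universal Streaming.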
end_of_turn_confidence_threshold does not exist on Universal-3 Pro. It was
never part of the U3 Pro API, so it is absent rather than deprecated. On
Universal Streaming, it is officially deprecated. Remove this parameter and
configure min_turn_silence and max_turn_silence instead. For configuration
guidance, see Configuring Turn Detection.
Universal-3 Pro supports a prompt parameter for contextual prompting — a natural-language description of what the audio is about (domain, scenario, or full details). Transcription behavior itself is built in and optimized for streaming and turn detection. See the Prompting Guide for details.
CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    "prompt": "Customer support call about an internet service outage.",
}
Start with no prompt. Universal-3 Pro is optimized out of the box. Add
context when domain-specific vocabulary is being misrecognized, starting with
the broadest description that fits your use case.
prompt and keyterms_prompt can be used together. Use prompt to
describe the conversation and keyterms_prompt to enumerate the specific
terms that matter — they are complementary.
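A minimal sketch of combining the two parameters at connection time is shown below. The key terms are made-up examples, and passing keyterms_prompt as a JSON-encoded list in the query string is an assumption about the wire format; check the parameter reference before relying on it.

```python
import json
from urllib.parse import urlencode

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    # prompt describes the conversation as a whole.
    "prompt": "Customer support call about an internet service outage.",
    # keyterms_prompt lists specific terms (hypothetical examples here).
    # Assumption: the list is sent as a JSON-encoded string.
    "keyterms_prompt": json.dumps(["fiber modem", "ONT", "static IP"]),
}

query_string = urlencode(CONNECTION_PARAMS)
print(query_string)
```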
Update prompt, keyterms_prompt, min_turn_silence, and max_turn_silence during an active session without reconnecting. See Updating configuration mid-stream for details.
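As a rough sketch, a mid-stream update can be built as a JSON message restricted to the four fields listed above. The helper name is hypothetical, and the message shape (a "type" field set to "UpdateConfiguration" alongside the changed fields) is an assumption; see the linked page for the authoritative format.

```python
import json

# Fields the text above lists as updatable mid-stream on Universal-3 Pro.
UPDATABLE_FIELDS = {"prompt", "keyterms_prompt", "min_turn_silence", "max_turn_silence"}


def build_update_configuration(**changes) -> str:
    """Serialize an update message, rejecting fields that cannot be updated."""
    if not changes:
        raise ValueError("At least one field must be provided")
    unknown = set(changes) - UPDATABLE_FIELDS
    if unknown:
        raise ValueError(f"Not updatable mid-stream: {sorted(unknown)}")
    # Assumption: the message is identified by a "type" field.
    return json.dumps({"type": "UpdateConfiguration", **changes})


payload = build_update_configuration(
    max_turn_silence=1500,
    keyterms_prompt=["account number"],
)
print(payload)
# With an open WebSocket connection, the payload would then be sent as a text frame.
```

Rejecting unknown fields locally catches leftovers such as end_of_turn_confidence_threshold before they reach the server.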
ForceEndpoint is supported on both Universal Streaming and Universal-3 Pro — no migration changes needed. Force the current turn to end immediately based on external signals. See Forcing a turn endpoint for details.
Universal Streaming transcribes English by default. For multilingual support, use speech_model: "universal-streaming-multilingual". (Source)
Universal-3 Pro natively code-switches between 6 languages in a single model (English, Spanish, German, French, Portuguese, Italian), so no separate multilingual model is needed. It also supports automatic language detection, returning language_code and language_confidence fields in Turn messages. To bias toward a specific language, pass the language_code connection parameter (Language selection). See Supported languages for the full list.
Language Detection: Universal Streaming supports the language_detection connection parameter (true/false, default false) with the multilingual model. When enabled, Turn messages include language_code and language_confidence fields. Universal-3 Pro also supports language detection with code-switching; see Supported languages for details.
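Reading the detection fields from a Turn message might look like the sketch below. The helper name and the 0.5 confidence cutoff are illustrative assumptions, not recommended values; only the language_code and language_confidence field names come from this guide.

```python
from typing import Optional


def detected_language(turn: dict, min_confidence: float = 0.5) -> Optional[str]:
    """Return the turn's language code if it is present and confident enough."""
    code = turn.get("language_code")
    confidence = turn.get("language_confidence")
    if code is None or confidence is None:
        return None  # detection fields absent from this message
    return code if confidence >= min_confidence else None


print(detected_language({"language_code": "es", "language_confidence": 0.93}))
print(detected_language({"language_code": "de", "language_confidence": 0.2}))
```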