Model selection

The `speech_model` connection parameter lets you specify which model to use for streaming transcription.

Universal-3-Pro: Testing only

The Universal-3-Pro streaming model is currently available for testing purposes only. It is not yet recommended for production workloads while we scale out our infrastructure.

Available models

| Name | Parameter | Description |
| --- | --- | --- |
| Universal-Streaming English (default) | `"speech_model": "universal-streaming-english"` | Our low-latency streaming model optimized for real-time English transcription. |
| Native Code Switching | `"speech_model": "universal-streaming-multilingual"` | Our low-latency streaming model optimized for real-time multilingual transcription in English, Spanish, French, German, Italian, and Portuguese. |
| Universal-3-Pro (testing) | `"speech_model": "u3-pro"` | Our highest-accuracy model, with native multilingual code switching, strong entity accuracy, robust performance across varying audio, and prompting support. |
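Under the hood, `speech_model` is passed as a query parameter when the streaming WebSocket connection is opened. The sketch below illustrates that shape; the `/v3/ws` path is an assumption inferred from the `streaming.assemblyai.com` host used later in this guide, and in practice the SDK builds this URL for you.

```python
from urllib.parse import urlencode


def build_streaming_url(speech_model: str, sample_rate: int = 16000) -> str:
    # Illustrative only: the SDK assembles the connection URL itself.
    # The "/v3/ws" path is an assumption, not taken from this page.
    params = urlencode({"sample_rate": sample_rate, "speech_model": speech_model})
    return f"wss://streaming.assemblyai.com/v3/ws?{params}"


url = build_streaming_url("universal-streaming-multilingual")
print(url)
```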

Choosing a model

| Feature | Universal-Streaming English | Native Code Switching | Universal-3-Pro Streaming |
| --- | --- | --- | --- |
| Latency | Fastest | Fast | Fast |
| Partial transcripts | Yes | Yes | Yes |
| Multilingual | No | Per turn | Native code switching |
| Entity accuracy | Okay | Okay | Best |
| Disfluencies & filler words | No | No | Yes |
| Customization | Keyterms prompting (known context) | Keyterms prompting (known context) | Keyterms prompting (known context) + native prompting (unknown context) |
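The trade-offs above can be reduced to a simple decision rule. The helper below is a hypothetical illustration of that rule, not part of the AssemblyAI SDK; only the `speech_model` values it returns come from this page.

```python
def pick_speech_model(
    multilingual: bool = False,
    need_best_entity_accuracy: bool = False,
) -> str:
    """Illustrative chooser based on the comparison table above."""
    if need_best_entity_accuracy:
        # Highest accuracy, native code switching, prompting support.
        return "u3-pro"
    if multilingual:
        # Per-turn code switching across six supported languages.
        return "universal-streaming-multilingual"
    # Lowest latency, English only.
    return "universal-streaming-english"


print(pick_speech_model())                   # universal-streaming-english
print(pick_speech_model(multilingual=True))  # universal-streaming-multilingual
```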

For detailed setup and configuration of Universal-3-Pro streaming, see the Universal-3-Pro page. For prompting guidance, see the Prompting guide.

End-to-end example

You can select a model by setting the `speech_model` connection parameter when connecting to the streaming API:

```python
import logging
from typing import Type

import assemblyai as aai
from assemblyai.streaming.v3 import (
    BeginEvent,
    StreamingClient,
    StreamingClientOptions,
    StreamingError,
    StreamingEvents,
    StreamingParameters,
    TerminationEvent,
    TurnEvent,
)

api_key = "<YOUR_API_KEY>"

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def on_begin(self: Type[StreamingClient], event: BeginEvent):
    print(f"Session started: {event.id}")


def on_turn(self: Type[StreamingClient], event: TurnEvent):
    print(f"{event.transcript} ({event.end_of_turn})")


def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
    print(
        f"Session terminated: {event.audio_duration_seconds} seconds of audio processed"
    )


def on_error(self: Type[StreamingClient], error: StreamingError):
    print(f"Error occurred: {error}")


def main():
    client = StreamingClient(
        StreamingClientOptions(
            api_key=api_key,
            api_host="streaming.assemblyai.com",
        )
    )

    client.on(StreamingEvents.Begin, on_begin)
    client.on(StreamingEvents.Turn, on_turn)
    client.on(StreamingEvents.Termination, on_terminated)
    client.on(StreamingEvents.Error, on_error)

    client.connect(
        StreamingParameters(
            sample_rate=16000,
            speech_model="u3-pro",  # or "universal-streaming-english", "universal-streaming-multilingual"
            min_end_of_turn_silence_when_confident=100,
            max_turn_silence=1200,
            # format_turns=True,  # Whether to return formatted final transcripts (not applicable to u3-pro)
        )
    )

    try:
        client.stream(
            aai.extras.MicrophoneStream(sample_rate=16000)
        )
    finally:
        client.disconnect(terminate=True)


if __name__ == "__main__":
    main()
```