Supported languages
Universal-3 Pro Streaming supports 6 languages with native multilingual code switching. The model automatically detects and switches between languages mid-stream.
Need more than 6 languages? Consider the Whisper Streaming model (speech_model: "whisper-rt"), which supports 99 languages with automatic language detection. See the Whisper Streaming page for details.
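As a rough sketch of what switching models could look like over a raw WebSocket connection, the snippet below builds the connection URL for either model by changing only speech_model. The endpoint and parameter names are taken from the Quickstart on this page. The helper name build_endpoint is hypothetical, and whether Whisper Streaming accepts the same sample_rate parameter is an assumption to confirm on the Whisper Streaming page.

```python
from urllib.parse import urlencode

# Endpoint as used in the Quickstart examples on this page.
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_endpoint(speech_model: str, sample_rate: int = 16000) -> str:
    """Return the streaming URL with the chosen model in the query string.

    build_endpoint is a hypothetical helper, not part of any SDK.
    """
    params = {"sample_rate": sample_rate, "speech_model": speech_model}
    return f"{API_ENDPOINT_BASE_URL}?{urlencode(params)}"


u3_url = build_endpoint("u3-rt-pro")        # 6 languages, code switching
whisper_url = build_endpoint("whisper-rt")  # 99 languages (assumed same params)

print(u3_url)
print(whisper_url)
```

Keeping the model name in one place like this makes it easy to test both models against the same audio before choosing one.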
Regional dialects and variants
Universal-3 Pro Streaming goes beyond standard language support with deep understanding of regional dialects and local variants. Whether your audio features Quebecois French, Mexican Spanish, or Brazilian Portuguese, the model accurately captures speech as it’s naturally spoken — including colloquial expressions, local vocabulary, and accent-specific pronunciation patterns.
Dialect support: You do not need to specify a dialect code to get accurate dialect transcription. Universal-3 Pro automatically recognizes regional speech patterns when using the base language code (e.g., fr for all French dialects, es for all Spanish dialects).
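If your application stores locale tags such as fr-CA or es_MX, one way to follow this guidance is to reduce them to the base language code wherever you work with language codes. The helper below is a hypothetical utility written for illustration; it is not part of the AssemblyAI SDK and does not check whether the resulting language is supported.

```python
def base_language_code(locale_tag: str) -> str:
    """Reduce a locale tag such as 'fr-CA' or 'es_MX' to its base code.

    Hypothetical helper: it only strips the region subtag and lowercases
    the result. It does not validate the language against any model.
    """
    primary = locale_tag.strip().replace("_", "-").split("-", 1)[0]
    if not primary:
        raise ValueError(f"Empty language tag: {locale_tag!r}")
    return primary.lower()


for tag in ["fr-CA", "es_MX", "PT-br", "it"]:
    print(tag, "->", base_language_code(tag))
```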
English dialects and variants
| Dialect / Variant | Description |
| --- | --- |
| American English | Standard US English, including regional variants (Southern, Midwestern, Northeastern) |
| British English | UK English, including Received Pronunciation and regional accents |
| Australian English | Australian English with local expressions and pronunciation |
Spanish dialects and variants
| Dialect / Variant | Description |
| --- | --- |
| Castilian Spanish | Standard Peninsular Spanish as spoken in central and northern Spain |
| Mexican Spanish | Mexican Spanish with local vocabulary and pronunciation |
| Argentine Spanish | Rioplatense Spanish with distinctive voseo and pronunciation |
| Colombian Spanish | Colombian Spanish with regional speech patterns |
| Chilean Spanish | Chilean Spanish with rapid speech patterns and local slang |
| Caribbean Spanish | Cuban, Dominican, and Puerto Rican Spanish dialects |
| Spanglish | English-Spanish code-mixing common in US bilingual communities |
French dialects and variants
| Dialect / Variant | Description |
| --- | --- |
| Metropolitan French | Standard Parisian French |
| Canadian French (Quebecois) | Quebec French with distinctive vocabulary, pronunciation, and expressions |
| Belgian French | Belgian French with local vocabulary and pronunciation |
Portuguese dialects and variants
| Dialect / Variant | Description |
| --- | --- |
| Brazilian Portuguese | Brazilian Portuguese with local vocabulary, pronunciation, and expressions |
| European Portuguese | Standard Lisbon Portuguese with Iberian pronunciation |
Italian dialects and variants
| Dialect / Variant | Description |
| --- | --- |
| Standard Italian | Standard Italian based on Tuscan-influenced speech |
Configuration
Prompting can be used to guide the model toward a specific language. However, prompts should be thoroughly tested before use in production. For example, prepending "Transcribe Spanish" to your prompt has been shown to perform well on Spanish audio. See the Prompting guide for more details.
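A minimal sketch of the prepending step is shown below. The function name with_language_hint, the ". " separator, and the "prompt" key in the connection parameters are all assumptions made for illustration; check the Prompting guide for the actual parameter name and recommended wording.

```python
def with_language_hint(language: str, prompt: str = "") -> str:
    """Prepend a 'Transcribe <language>' hint to an optional base prompt.

    Hypothetical helper; the '. ' separator is an assumption.
    """
    hint = f"Transcribe {language}"
    base = prompt.strip()
    return f"{hint}. {base}" if base else hint


# "prompt" as a connection parameter name is an assumption, not confirmed here.
connection_params = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    "prompt": with_language_hint("Spanish", "Keep filler words."),
}

print(connection_params["prompt"])
```

Building the prompt in one function makes it easier to test different wordings against the same audio before using one in production.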
Quickstart
The following examples demonstrate how to stream audio to Universal-3 Pro Streaming.
The examples appear in this order: Python SDK, Python (WebSocket), JavaScript SDK, JavaScript (WebSocket).
Python SDK
Install the required libraries: pip install "assemblyai>=0.54.0" pyaudio
Create a new file main.py and paste the code below. Replace <YOUR_API_KEY> with your API key.
Run with python main.py and speak into your microphone.
```python
import logging
from typing import Type

import assemblyai as aai
from assemblyai.streaming.v3 import (
    BeginEvent,
    StreamingClient,
    StreamingClientOptions,
    StreamingError,
    StreamingEvents,
    StreamingParameters,
    TurnEvent,
    TerminationEvent,
)

api_key = "<YOUR_API_KEY>"

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def on_begin(self: Type[StreamingClient], event: BeginEvent):
    print(f"Session started: {event.id}")


def on_turn(self: Type[StreamingClient], event: TurnEvent):
    print(f"{event.transcript} ({event.end_of_turn})")


def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
    print(
        f"Session terminated: {event.audio_duration_seconds} seconds of audio processed"
    )


def on_error(self: Type[StreamingClient], error: StreamingError):
    print(f"Error occurred: {error}")


def main():
    client = StreamingClient(
        StreamingClientOptions(
            api_key=api_key,
            api_host="streaming.assemblyai.com",
        )
    )

    client.on(StreamingEvents.Begin, on_begin)
    client.on(StreamingEvents.Turn, on_turn)
    client.on(StreamingEvents.Termination, on_terminated)
    client.on(StreamingEvents.Error, on_error)

    client.connect(
        StreamingParameters(
            sample_rate=16000,
            speech_model="u3-rt-pro",
        )
    )

    try:
        client.stream(
            aai.extras.MicrophoneStream(sample_rate=16000)
        )
    finally:
        client.disconnect(terminate=True)


if __name__ == "__main__":
    main()
```
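The on_turn handler in the example above prints every Turn event, whether or not the turn has ended. If you only want completed turns, one possible approach is to collect transcripts where end_of_turn is true. The sketch below uses a stand-in FakeTurn class with just the two fields the example reads, so it runs without the SDK or a microphone; FakeTurn and TurnCollector are hypothetical names, and treating events with end_of_turn set to False as in-progress updates is an assumption based on how the examples on this page display them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeTurn:
    """Stand-in for TurnEvent with only the two fields used in on_turn."""
    transcript: str
    end_of_turn: bool


@dataclass
class TurnCollector:
    """Keeps the transcript of each completed turn, ignoring partial updates."""
    finished: List[str] = field(default_factory=list)

    def handle(self, event) -> None:
        # Only store non-empty transcripts from turns marked as ended.
        if event.end_of_turn and event.transcript:
            self.finished.append(event.transcript)


collector = TurnCollector()
for ev in [
    FakeTurn("hola", False),
    FakeTurn("hola how are you", False),
    FakeTurn("Hola, how are you?", True),
]:
    collector.handle(ev)

print(collector.finished)
```

In the SDK example, the same idea would amount to calling collector.handle(event) from inside on_turn.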
Python (WebSocket)
Install the required libraries: pip install websocket-client pyaudio
Create a new file main.py and paste the code below. Replace <YOUR_API_KEY> with your API key.
Run with python main.py and speak into your microphone.
```python
import pyaudio
import websocket
import json
import threading
import time
from urllib.parse import urlencode

YOUR_API_KEY = "<YOUR_API_KEY>"

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
}
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"
API_ENDPOINT = f"{API_ENDPOINT_BASE_URL}?{urlencode(CONNECTION_PARAMS)}"

FRAMES_PER_BUFFER = 800
SAMPLE_RATE = CONNECTION_PARAMS["sample_rate"]
CHANNELS = 1
FORMAT = pyaudio.paInt16

audio = None
stream = None
ws_app = None
audio_thread = None
stop_event = threading.Event()


def on_open(ws):
    print("WebSocket connection opened.")

    def stream_audio():
        global stream
        while not stop_event.is_set():
            try:
                audio_data = stream.read(FRAMES_PER_BUFFER, exception_on_overflow=False)
                ws.send(audio_data, websocket.ABNF.OPCODE_BINARY)
            except Exception as e:
                print(f"Error streaming audio: {e}")
                break

    global audio_thread
    audio_thread = threading.Thread(target=stream_audio)
    audio_thread.daemon = True
    audio_thread.start()


def on_message(ws, message):
    try:
        data = json.loads(message)
        msg_type = data.get("type")

        if msg_type == "Begin":
            print(f"Session began: ID={data.get('id')}")
        elif msg_type == "Turn":
            transcript = data.get("transcript", "")
            end_of_turn = data.get("end_of_turn", False)
            if end_of_turn:
                print(f"\r{' ' * 80}\r{transcript}")
            else:
                print(f"\r{transcript}", end="")
        elif msg_type == "Termination":
            print(f"\nSession terminated: {data.get('audio_duration_seconds', 0)}s of audio")
    except Exception as e:
        print(f"Error handling message: {e}")


def on_error(ws, error):
    print(f"\nWebSocket Error: {error}")
    stop_event.set()


def on_close(ws, close_status_code, close_msg):
    print(f"\nWebSocket Disconnected: Status={close_status_code}")
    global stream, audio
    stop_event.set()
    if stream:
        if stream.is_active():
            stream.stop_stream()
        stream.close()
    if audio:
        audio.terminate()


def run():
    global audio, stream, ws_app

    audio = pyaudio.PyAudio()
    stream = audio.open(
        input=True,
        frames_per_buffer=FRAMES_PER_BUFFER,
        channels=CHANNELS,
        format=FORMAT,
        rate=SAMPLE_RATE,
    )
    print("Speak into your microphone. Press Ctrl+C to stop.")

    ws_app = websocket.WebSocketApp(
        API_ENDPOINT,
        header={"Authorization": YOUR_API_KEY},
        on_open=on_open,
        on_message=on_message,
        on_error=on_error,
        on_close=on_close,
    )

    ws_thread = threading.Thread(target=ws_app.run_forever)
    ws_thread.daemon = True
    ws_thread.start()

    try:
        while ws_thread.is_alive():
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("\nStopping...")
        stop_event.set()
        if ws_app and ws_app.sock and ws_app.sock.connected:
            ws_app.send(json.dumps({"type": "Terminate"}))
            time.sleep(2)
        if ws_app:
            ws_app.close()
        ws_thread.join(timeout=2.0)


if __name__ == "__main__":
    run()
```
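To make the audio settings in this example concrete, the short calculation below works out how much audio each stream.read() call sends. It uses the constants from the example (16000 Hz, 800 frames per buffer, mono) and the fact that pyaudio.paInt16 means 2 bytes per sample. The variable names chunk_ms, chunk_bytes, and chunks_per_second are illustrative only.

```python
SAMPLE_RATE = 16000        # Hz, from CONNECTION_PARAMS["sample_rate"]
FRAMES_PER_BUFFER = 800    # frames read per stream.read() call
CHANNELS = 1               # mono
BYTES_PER_SAMPLE = 2       # pyaudio.paInt16 is 16-bit audio

# Duration of one chunk in milliseconds.
chunk_ms = FRAMES_PER_BUFFER * 1000 / SAMPLE_RATE

# Size of one binary WebSocket message in bytes.
chunk_bytes = FRAMES_PER_BUFFER * CHANNELS * BYTES_PER_SAMPLE

# Number of messages sent per second of audio.
chunks_per_second = SAMPLE_RATE / FRAMES_PER_BUFFER

print(f"{chunk_ms} ms per chunk, {chunk_bytes} bytes, {chunks_per_second} chunks/s")
```

With these settings each message carries 50 ms of audio in 1600 bytes, so about 20 messages are sent per second. If you change sample_rate, scale FRAMES_PER_BUFFER accordingly to keep the same chunk duration.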
JavaScript SDK
Install the required libraries: npm install assemblyai node-record-lpcm16
The module node-record-lpcm16 requires SoX, which must be available in your $PATH.
For macOS: brew install sox
For most Linux distros: sudo apt-get install sox libsox-fmt-all
For Windows: download the binaries
Create a new file main.js and paste the code below. Replace <YOUR_API_KEY> with your API key.
Run with node main.js and speak into your microphone.
```javascript
import { Readable } from "stream";
import { AssemblyAI } from "assemblyai";
import recorder from "node-record-lpcm16";

const run = async () => {
  const client = new AssemblyAI({
    apiKey: "<YOUR_API_KEY>",
  });

  const transcriber = client.streaming.transcriber({
    sampleRate: 16_000,
    speechModel: "u3-rt-pro",
  });

  transcriber.on("open", ({ id }) => {
    console.log(`Session opened with ID: ${id}`);
  });

  transcriber.on("error", (error) => {
    console.error("Error:", error);
  });

  transcriber.on("close", (code, reason) =>
    console.log("Session closed:", code, reason)
  );

  transcriber.on("turn", (turn) => {
    if (!turn.transcript) {
      return;
    }
    console.log("Turn:", turn.transcript);
  });

  try {
    console.log("Connecting to streaming transcript service");
    await transcriber.connect();

    console.log("Starting recording");
    const recording = recorder.record({
      channels: 1,
      sampleRate: 16_000,
      audioType: "wav",
    });

    Readable.toWeb(recording.stream()).pipeTo(transcriber.stream());

    process.on("SIGINT", async function () {
      console.log();
      console.log("Stopping recording");
      recording.stop();

      console.log("Closing streaming transcript connection");
      await transcriber.close();

      process.exit();
    });
  } catch (error) {
    console.error(error);
  }
};

run();
```
JavaScript (WebSocket)
Install the required libraries: npm install ws mic
Create a new file main.js and paste the code below. Replace <YOUR_API_KEY> with your API key.
Run with node main.js and speak into your microphone.
```javascript
const WebSocket = require("ws");
const mic = require("mic");
const querystring = require("querystring");

const YOUR_API_KEY = "<YOUR_API_KEY>";

const CONNECTION_PARAMS = {
  sample_rate: 16000,
  speech_model: "u3-rt-pro",
};
const API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws";
const API_ENDPOINT = `${API_ENDPOINT_BASE_URL}?${querystring.stringify(CONNECTION_PARAMS)}`;

const SAMPLE_RATE = CONNECTION_PARAMS.sample_rate;

let micInstance = null;
let ws = null;

function run() {
  console.log("Starting AssemblyAI streaming transcription...");

  ws = new WebSocket(API_ENDPOINT, {
    headers: { Authorization: YOUR_API_KEY },
  });

  ws.on("open", () => {
    console.log("WebSocket connection opened.");

    micInstance = mic({
      rate: String(SAMPLE_RATE),
      channels: "1",
      bitwidth: "16",
      encoding: "signed-integer",
      endian: "little",
    });

    const micInputStream = micInstance.getAudioStream();
    micInputStream.on("data", (data) => {
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(data);
      }
    });

    micInstance.start();
    console.log("Speak into your microphone. Press Ctrl+C to stop.");
  });

  ws.on("message", (data) => {
    try {
      const msg = JSON.parse(data);

      if (msg.type === "Begin") {
        console.log(`Session began: ID=${msg.id}`);
      } else if (msg.type === "Turn") {
        const transcript = msg.transcript || "";
        if (msg.end_of_turn) {
          process.stdout.write("\r" + " ".repeat(80) + "\r");
          console.log(transcript);
        } else {
          process.stdout.write(`\r${transcript}`);
        }
      } else if (msg.type === "Termination") {
        console.log(
          `\nSession terminated: ${msg.audio_duration_seconds}s of audio`
        );
      }
    } catch (e) {
      console.error("Error parsing message:", e);
    }
  });

  ws.on("error", (error) => {
    console.error("WebSocket error:", error);
  });

  ws.on("close", (code, reason) => {
    console.log(`WebSocket closed: ${code}`);
    if (micInstance) micInstance.stop();
  });

  process.on("SIGINT", () => {
    console.log("\nStopping...");
    if (micInstance) micInstance.stop();
    if (ws && ws.readyState === WebSocket.OPEN) {
      ws.send(JSON.stringify({ type: "Terminate" }));
      setTimeout(() => ws.close(), 2000);
    }
  });
}

run();
```