Terminate Streaming Session After Inactivity

An often-overlooked aspect of implementing AssemblyAI’s Streaming Speech-to-Text (STT) service is terminating transcription sessions efficiently. In this cookbook, you will learn how to terminate a Streaming session after a fixed duration of silence.

For the full code, refer to this GitHub gist.

Quickstart

import logging
from datetime import datetime
from typing import Type

import assemblyai as aai
from assemblyai.streaming.v3 import (
    BeginEvent,
    StreamingClient,
    StreamingClientOptions,
    StreamingError,
    StreamingEvents,
    StreamingParameters,
    TerminationEvent,
    TurnEvent,
)

api_key = "<YOUR_API_KEY>"

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

last_transcript_received = datetime.now()
terminated = False


def on_begin(self: Type[StreamingClient], event: BeginEvent):
    print(f"Session started: {event.id}")


def on_turn(self: Type[StreamingClient], event: TurnEvent):
    global last_transcript_received, terminated

    if terminated:
        return

    print(f"{event.transcript} ({event.end_of_turn})")

    if event.transcript.strip():
        last_transcript_received = datetime.now()

    silence_duration = (datetime.now() - last_transcript_received).total_seconds()
    if silence_duration > 5:
        print("No transcription received in 5 seconds. Terminating session...")
        self.disconnect(terminate=True)
        terminated = True
        return


def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
    print(f"Session terminated after {event.audio_duration_seconds:.2f} seconds")


def on_error(self: Type[StreamingClient], error: StreamingError):
    print(f"Error occurred: {error}")


def main():
    client = StreamingClient(
        StreamingClientOptions(
            api_key=api_key,
            api_host="streaming.assemblyai.com",
        )
    )

    client.on(StreamingEvents.Begin, on_begin)
    client.on(StreamingEvents.Turn, on_turn)
    client.on(StreamingEvents.Termination, on_terminated)
    client.on(StreamingEvents.Error, on_error)

    client.connect(
        StreamingParameters(
            sample_rate=16000,
        )
    )

    try:
        client.stream(aai.extras.MicrophoneStream(sample_rate=16000))
    finally:
        if not terminated:
            client.disconnect(terminate=True)


if __name__ == "__main__":
    main()

Get Started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard.

Step-by-step instructions

First, install AssemblyAI’s Python SDK.

$ pip install assemblyai

Next, import the required modules and set your API key.

import logging
from datetime import datetime
from typing import Type

import assemblyai as aai
from assemblyai.streaming.v3 import (
    BeginEvent,
    StreamingClient,
    StreamingClientOptions,
    StreamingError,
    StreamingEvents,
    StreamingParameters,
    TerminationEvent,
    TurnEvent,
)

api_key = "<YOUR_API_KEY>"
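
If you would rather not hardcode the key in the script, one option is to read it from the environment. The sketch below is only an illustration: the helper name load_api_key and the variable name ASSEMBLYAI_API_KEY are assumptions made for this example, not something this cookbook prescribes.

```python
import os
from typing import Mapping


def load_api_key(
    env: Mapping[str, str] = os.environ,
    var_name: str = "ASSEMBLYAI_API_KEY",  # assumed name, chosen for this example
) -> str:
    """Return the API key stored in `env`, or raise if it is missing or blank."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your AssemblyAI API key"
        )
    return key
```

With this in place, api_key = load_api_key() could replace the hardcoded placeholder above.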

Implementing Speech Activity Checks

Our Streaming API emits a Turn event (TurnEvent in the SDK) each time speech is processed. During periods of silence, no Turn event is sent. You can use this behavior to detect inactivity and automatically terminate the session.

We can track the timestamp of the most recent non-empty transcript using a datetime object. On every Turn event, we:

  • Update the timestamp if meaningful speech is received

  • Check how many seconds have passed since the last valid transcript

  • If that exceeds your timeout (e.g. 5 seconds), terminate the session
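
The three steps above can be sketched in isolation from the SDK. The SilenceTracker class below is a hypothetical helper written for this illustration (it is not part of the AssemblyAI SDK), and it assumes a monotonic clock is acceptable in place of datetime. The clock is injectable so the logic can be exercised without waiting in real time.

```python
import time
from typing import Callable


class SilenceTracker:
    """Tracks how long it has been since the last non-empty transcript."""

    def __init__(
        self,
        timeout_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_speech = clock()

    def observe(self, transcript: str) -> bool:
        """Record one transcript; return True if the timeout has been exceeded."""
        now = self._clock()
        if transcript.strip():
            # Meaningful speech was received: reset the silence window.
            self._last_speech = now
        # Seconds elapsed since the last non-empty transcript.
        return (now - self._last_speech) > self.timeout_seconds
```

In a Turn handler, a True result from tracker.observe(event.transcript) would be the signal to disconnect.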

Key Variables

last_transcript_received = datetime.now()
terminated = False

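last_transcript_received is refreshed whenever a Turn event carries a non-empty transcript. terminated is set to True once the session has been disconnected, so that any later events are ignored.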

Turn event logic

def on_turn(self: Type[StreamingClient], event: TurnEvent):
    global last_transcript_received, terminated

    # Ignore events that arrive after we have already disconnected.
    if terminated:
        return

    print(f"{event.transcript} ({event.end_of_turn})")

    # Reset the silence window only when the transcript contains text.
    if event.transcript.strip():
        last_transcript_received = datetime.now()

    # Seconds elapsed since the last non-empty transcript.
    silence_duration = (datetime.now() - last_transcript_received).total_seconds()
    if silence_duration > 5:
        print("No transcription received in 5 seconds. Terminating session...")
        self.disconnect(terminate=True)
        terminated = True
        return

This pattern ensures sessions are cleanly terminated after inactivity.
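
The check above lives inside on_turn, so it is evaluated only when a Turn event arrives. If you also want a timeout that can fire when no events arrive at all, one possible approach is a small background watchdog. The sketch below is a generic illustration using only the standard library. InactivityWatchdog is a hypothetical helper, not part of the SDK. Whether client.disconnect(terminate=True) can safely be called from a second thread while client.stream(...) is running is an assumption you should verify before relying on it.

```python
import threading
import time
from typing import Callable


class InactivityWatchdog:
    """Calls `on_timeout` once if `touch()` is not called for `timeout_seconds`."""

    def __init__(
        self,
        timeout_seconds: float,
        on_timeout: Callable[[], None],
        poll_interval: float = 0.05,
    ):
        self._timeout = timeout_seconds
        self._on_timeout = on_timeout
        self._poll = poll_interval
        self._last_activity = time.monotonic()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self.touch()
        self._thread.start()

    def touch(self) -> None:
        """Record activity, e.g. a Turn event with a non-empty transcript."""
        with self._lock:
            self._last_activity = time.monotonic()

    def stop(self) -> None:
        self._stop.set()

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._poll):
            with self._lock:
                idle = time.monotonic() - self._last_activity
            if idle > self._timeout:
                self._stop.set()
                self._on_timeout()
                return
```

In this setup, on_turn would call watchdog.touch() whenever event.transcript.strip() is non-empty, and the on_timeout callback would perform the disconnect.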

What You’ll Observe

  • Live transcription continues as long as there’s speech

  • After 5 seconds of silence, the session ends automatically

You can change the timeout value to suit your needs by modifying the threshold in the silence_duration > 5 check (and the matching message text).
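
To avoid editing the handler each time the timeout changes, the threshold can be passed in as a parameter. The following is a minimal sketch, assuming a handler factory is acceptable in your code base. make_on_turn and its now parameter are names invented for this example. The returned function has the same (client, event) shape as the on_turn handler above and keeps its state in a closure instead of module-level globals.

```python
from datetime import datetime, timedelta


def make_on_turn(timeout_seconds: float = 5.0, now=datetime.now):
    """Build a Turn handler that disconnects after `timeout_seconds` of silence."""
    timeout = timedelta(seconds=timeout_seconds)
    state = {"last_transcript_received": now(), "terminated": False}

    def on_turn(client, event):
        if state["terminated"]:
            return

        # Reset the silence window only when the transcript contains text.
        if event.transcript.strip():
            state["last_transcript_received"] = now()

        if now() - state["last_transcript_received"] > timeout:
            print(
                f"No transcription received in {timeout_seconds} seconds. "
                "Terminating session..."
            )
            client.disconnect(terminate=True)
            state["terminated"] = True

    return on_turn
```

It would be registered in place of the original handler, for example client.on(StreamingEvents.Turn, make_on_turn(timeout_seconds=10)).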