Terminate Streaming Session After Inactivity
An often-overlooked aspect of implementing AssemblyAI’s Streaming Speech-to-Text (STT) service is efficiently terminating transcription sessions. In this cookbook, you will learn how to terminate a Streaming session after any fixed duration of silence.
For the full code, refer to this GitHub gist.
Quickstart
Get Started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard.
Step-by-step instructions
First, install AssemblyAI’s Python SDK, for example by running pip install assemblyai.
Handling inactivity
Empty transcripts
As long as a session is open, our Streaming STT service will continue sending empty PartialTranscripts that look like this:
Message 1:
Message 2:
Thus, we can treat a sustained run of empty partial transcripts as a sign that the user has stopped speaking.
Note: Other keys in the payload have been omitted for brevity but can be seen here in our Streaming API Reference.
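As a minimal sketch of the check described above, the snippet below parses two illustrative messages and flags the one whose text is empty. The payloads are made-up examples that keep only the message_type and text keys, and is_empty_partial is a hypothetical helper name, not part of the SDK.

```python
import json

# Illustrative payloads only: values are invented and other keys are omitted.
RAW_MESSAGES = [
    '{"message_type": "PartialTranscript", "text": ""}',
    '{"message_type": "PartialTranscript", "text": "hello world"}',
]


def is_empty_partial(raw: str) -> bool:
    """Return True when a message is a PartialTranscript with no text."""
    message = json.loads(raw)
    return message.get("message_type") == "PartialTranscript" and not message.get("text")


# One flag per message: True means "nothing was transcribed in this update".
flags = [is_empty_partial(raw) for raw in RAW_MESSAGES]
print(flags)
```

When using the Python SDK you would normally not parse JSON yourself; the same test reduces to checking whether transcript.text is empty.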
Implementing Partial Transcript Checks
Let’s consider a code example that tracks whether the PartialTranscripts have been empty for a given duration of time.
Define your Streaming functions as per normal.
Then, define the variable last_transcript_received = datetime.now(), and set a flag terminated to False. We will use these variables later on.
Next, define your on_data function:
- Access the global variable last_transcript_received, as well as terminated.
- If the Streaming STT transcriber has been terminated, return immediately without doing anything.
- If transcript.text is empty, check if it has been 5 seconds since the last non-empty transcript. When true, terminate the transcriber.
- Else, just print the text in our terminal as per usual, and set the time of the last transcript received to now.
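The steps above can be sketched roughly as follows. To keep the example runnable without a live session, it feeds on_data stand-in objects that only carry a text attribute, and terminate_transcription is a placeholder that just flips the flag. SILENCE_LIMIT_SECONDS and the printed list are hypothetical names introduced for illustration.

```python
from datetime import datetime, timedelta
from types import SimpleNamespace

SILENCE_LIMIT_SECONDS = 5  # the fixed silence duration used in this cookbook

last_transcript_received = datetime.now()
terminated = False
printed = []  # hypothetical: records printed text so the demo can be inspected


def terminate_transcription():
    """Placeholder: the real version closes the transcriber session."""
    global terminated
    terminated = True


def on_data(transcript):
    global last_transcript_received, terminated

    if terminated:
        return  # session already ended, ignore late messages

    if transcript.text == "":
        # Time elapsed since the last non-empty transcript.
        silence = (datetime.now() - last_transcript_received).total_seconds()
        if silence > SILENCE_LIMIT_SECONDS:
            print(f"{SILENCE_LIMIT_SECONDS} seconds without new transcription, terminating...")
            terminate_transcription()
        return

    print(transcript.text)
    printed.append(transcript.text)
    last_transcript_received = datetime.now()


# Demo with stand-in transcripts instead of a live session.
on_data(SimpleNamespace(text="hello"))
on_data(SimpleNamespace(text=""))  # silence has only just started
still_running = not terminated
last_transcript_received -= timedelta(seconds=6)  # pretend 6 seconds have passed
on_data(SimpleNamespace(text=""))  # over the limit, so the session is terminated
on_data(SimpleNamespace(text="ignored"))  # dropped because terminated is True
```

Comparing against the timestamp of the last non-empty transcript, rather than counting empty messages, keeps the timeout independent of how often the service sends updates.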
Lastly, we define our on_close and terminate_transcription functions. on_close simply sets terminated to True when the WebSocket connection closes.
terminate_transcription just accesses the global transcriber and closes the session when the function is called by on_data.
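A minimal sketch of these two functions is shown below. FakeTranscriber is a hypothetical stand-in that models only close(), and it assumes that closing the session triggers the on_close callback; with the real SDK the global transcriber would be the object you create in the next step.

```python
terminated = False


class FakeTranscriber:
    """Hypothetical stand-in for the SDK transcriber; only close() is modelled."""

    def __init__(self, on_close):
        self._on_close = on_close
        self.close_calls = 0

    def close(self):
        self.close_calls += 1
        self._on_close()  # assumption: closing the session fires on_close


def on_close():
    global terminated
    if not terminated:
        print("Closing session")
        terminated = True


def terminate_transcription():
    # Reads the module-level transcriber and closes it at most once.
    if not terminated:
        transcriber.close()


transcriber = FakeTranscriber(on_close=on_close)
terminate_transcription()
terminate_transcription()  # no-op: terminated is already True
```

Guarding on the terminated flag means a burst of empty transcripts arriving after the timeout cannot close the session twice.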
Create your Streaming STT transcriber and start your transcription.
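One possible way to wire this up is sketched below. It assumes an SDK version that exposes RealtimeTranscriber and extras.MicrophoneStream, that the microphone helper needs the SDK's optional extras installed, and that your API key is stored in an ASSEMBLYAI_API_KEY environment variable. start_transcription is a hypothetical wrapper name, and the callbacks are the ones defined in the earlier steps.

```python
import os

transcriber = None  # module-level so terminate_transcription can reach it


def start_transcription(on_data, on_error, on_open, on_close, sample_rate=16_000):
    """Open a session and stream microphone audio until the session is closed."""
    global transcriber
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import assemblyai as aai

    # Assumption: the API key lives in this environment variable.
    aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

    transcriber = aai.RealtimeTranscriber(
        sample_rate=sample_rate,
        on_data=on_data,
        on_error=on_error,
        on_open=on_open,
        on_close=on_close,
    )
    transcriber.connect()

    # Feed microphone audio to the session; on_data decides when to stop.
    microphone_stream = aai.extras.MicrophoneStream(sample_rate=sample_rate)
    transcriber.stream(microphone_stream)
    return transcriber
```

Calling start_transcription(on_data, on_error, on_open, on_close) would then start the session, with on_data ending it once the silence limit is exceeded.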
What you should observe is that transcription works in real time and automatically terminates after 5 seconds of silence!