Keyterms Prompting | AssemblyAI

The keyterms prompting feature helps improve recognition accuracy for specific words and phrases that are important to your use case. Keyterms prompting is supported for both the English and multilingual streaming models.

Keyterms Prompting costs an additional $0.04/hour.

Quickstart

Python SDK

Python

JavaScript SDK

Javascript

Firstly, install the required dependencies.

$ pip install assemblyai

Python SDK

Python

Javascript

JavaScript SDK

1 import logging
2 from typing import Type
3 
4 import assemblyai as aai
5 from assemblyai.streaming.v3 import (
6     BeginEvent,
7     StreamingClient,
8     StreamingClientOptions,
9     StreamingError,
10     StreamingEvents,
11     StreamingParameters,
12     TerminationEvent,
13     TurnEvent,
14 )
15 
16 api_key = "<YOUR_API_KEY>"
17 
18 logging.basicConfig(level=logging.INFO)
19 logger = logging.getLogger(__name__)
20 
21 
22 def on_begin(self: Type[StreamingClient], event: BeginEvent):
23     print(f"Session started: {event.id}")
24 
25 
26 def on_turn(self: Type[StreamingClient], event: TurnEvent):
27     if event.turn_is_formatted:
28         # Clear the line and print formatted final transcript on new line
29         print(f"\r{' ' * 100}\r{event.transcript}")
30     else:
31         # Overwrite current line with partial unformatted transcript
32         print(f"\r{event.transcript}", end='', flush=True)
33 
34 
35 def on_terminated(self: Type[StreamingClient], event: TerminationEvent):
36     print(
37         f"Session terminated: {event.audio_duration_seconds} seconds of audio processed"
38     )
39 
40 
41 def on_error(self: Type[StreamingClient], error: StreamingError):
42     print(f"Error occurred: {error}")
43 
44 
45 def main():
46     client = StreamingClient(
47         StreamingClientOptions(
48             api_key=api_key,
49             api_host="streaming.assemblyai.com",
50         )
51     )
52 
53     client.on(StreamingEvents.Begin, on_begin)
54     client.on(StreamingEvents.Turn, on_turn)
55     client.on(StreamingEvents.Termination, on_terminated)
56     client.on(StreamingEvents.Error, on_error)
57 
58     client.connect(
59         StreamingParameters(
60             sample_rate=16000,
61             format_turns=True,
62             keyterms_prompt=["Keanu Reeves", "AssemblyAI", "Universal-2"],
63         )
64     )
65 
66     try:
67         client.stream(
68           aai.extras.MicrophoneStream(sample_rate=16000)
69         )
70     finally:
71         client.disconnect(terminate=True)
72 
73 
74 if __name__ == "__main__":
75     main()

Configuration

To utilize keyterms prompting, you need to include your desired keyterms as query parameters in the WebSocket URL.

You can include a maximum of 100 keyterms per session.
Each individual keyterm string must be 50 characters or less in length.

How it works

Streaming Keyterms Prompting has two components to improve accuracy for your terms.

Word-level boosting

The streaming model itself is biased during inference to be more accurate at identifying words from your keyterms list. This happens in real-time as words are emitted during the streaming process, providing immediate improvements to recognition accuracy. This component is enabled by default.

Turn-level boosting

After each turn is completed, an additional boosting pass analyzes the full transcript using your keyterms list. This post-processing step, similar to formatting, provides a second layer of accuracy improvement by examining the complete context of the turn. To enable this component, set format_turns to True.

Both stages work together to maximize recognition accuracy for your keyterms throughout the streaming process.

Dynamic keyterms prompting

Dynamic keyterms prompting allows you to update keyterms during an active streaming session using the UpdateConfiguration message. This enables you to adapt the recognition context in real-time based on conversation flow or changing requirements.

Updating keyterms during a session

To update keyterms while streaming, send an UpdateConfiguration message with a new keyterms_prompt array:

Python SDK

Python

JavaScript SDK

Javascript

1 # Replace or establish new set of keyterms
2 client.update_configuration(keyterms_prompt=["Universal-3"])
3 
4 # Remove keyterms and reset context biasing
5 client.update_configuration(keyterms_prompt=[])

How dynamic keyterms work

When you send an UpdateConfiguration message:

Replacing keyterms: Providing a new array of keyterms completely replaces the existing set. The new keyterms take effect immediately for subsequent audio processing.
Clearing keyterms: Sending an empty array [] removes all keyterms and resets context biasing to the default state.
Both boosting stages: Dynamic keyterms work with both word-level boosting (native context biasing) and turn-level boosting (metaphone-based), just like initial keyterms.

Use cases for dynamic keyterms

Dynamic keyterms are particularly useful for:

Context-aware voice agents: Update keyterms based on conversation stage (e.g., switching from menu items to payment terms)
Multi-topic conversations: Adapt vocabulary as the conversation topic changes
Progressive disclosure: Add relevant keyterms as new information becomes available
Cleanup: Remove keyterms that are no longer relevant to reduce processing overhead

Important notes

Keyterms prompts longer than 50 characters are ignored.
Requests containing more than 100 keyterms will result in an error.

Best practices

To maximize the effectiveness of keyterms prompting:

Specify Unique Terminology: Include proper names, company names, technical terms, or vocabulary specific to your domain that might not be commonly recognized.
Exact Spelling and Capitalization: Provide keyterms with the precise spelling and capitalization you expect to see in the output transcript. This helps the system accurately identify the terms.
Avoid Common Words: Do not include single, common English words (e.g., “information”) as keyterms. The system is generally proficient with such words, and adding them as keyterms can be redundant.