Keyterms Prompting

Keyterms prompting lets you pass up to 1,000 words or phrases (at most 6 words per phrase) in the keyterms_prompt parameter. This improves transcription accuracy for those terms, their variations, and contextually similar phrases.

The following example uses keyterms prompting to improve transcription accuracy for a name with distinctive spelling and formatting.

Without keyterms prompting:

Hi, this is Kelly Byrne Donahue

With keyterms prompting:

Hi, this is Kelly Byrne-Donoghue

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

data = {
    "audio_url": "https://assemblyaiassets.com/audios/keyterms_prompting.wav",
    "language_detection": True,
    "speech_models": ["universal-3-pro", "universal-2"],
    # Words or phrases to improve transcription accuracy for
    "keyterms_prompt": ["Kelly Byrne-Donoghue"]
}

# Submit the transcription request
response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

if response.status_code != 200:
    print(f"Error: {response.status_code}, Response: {response.text}")
    response.raise_for_status()

transcript_response = response.json()
transcript_id = transcript_response["id"]
polling_endpoint = f"{base_url}/v2/transcript/{transcript_id}"

# Poll every 3 seconds until the transcript completes or fails
while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        print(transcript["text"])
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

Keyword count limits

Although we support up to 1,000 keywords and phrases, actual capacity may be lower because of internal tokenization and implementation constraints. Keep these points in mind:

  • Each word in a multi-word phrase counts toward the 1,000-keyword limit.
  • Capitalization affects capacity: uppercase tokens consume more than lowercase ones.
  • Longer words consume more capacity than shorter ones.

For best results, use shorter phrases where possible and watch your total token count as you approach the keyword limit.
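
As a rough pre-flight check, the limits described above can be sketched in a small helper. This is only an illustration: check_keyterms is a hypothetical function, not part of the API or SDK, and it counts whitespace-separated words, which is an assumption. The service's internal tokenization may count capacity differently, so a list that passes this check can still exceed the real capacity.

```python
def check_keyterms(keyterms, max_words_per_phrase=6, max_total_words=1000):
    """Return a list of problems found; an empty list means no limit was hit.

    Hypothetical helper. Words are counted by splitting on whitespace,
    which only approximates how the service measures capacity.
    """
    problems = []
    total_words = 0
    for term in keyterms:
        n_words = len(term.split())
        total_words += n_words  # every word in a phrase counts toward the total
        if n_words == 0:
            problems.append(f"empty keyterm: {term!r}")
        elif n_words > max_words_per_phrase:
            problems.append(
                f"{term!r} has {n_words} words (limit {max_words_per_phrase})"
            )
    if total_words > max_total_words:
        problems.append(f"{total_words} words in total (limit {max_total_words})")
    return problems


if __name__ == "__main__":
    for problem in check_keyterms(["Kelly Byrne-Donoghue", "differential diagnosis"]):
        print(problem)
```

Running the check before building the request body makes it easier to see which phrases to shorten or drop when a list approaches the limit.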

Using Universal-2 (Beta)

keyterms_prompt for Universal-2 is currently available at no additional cost while we gather feedback and refine functionality. Pricing may be introduced as the feature moves out of beta. We’ll notify all users well in advance of any pricing changes.

As we continue to develop this feature, functionality may evolve. For the latest updates and code examples, please check back on this page.

With the Universal-2 model, the keyterms_prompt parameter is in beta and is available for English files.

The maximum number of keyterms with Universal-2 is 200. Keyterms shorter than 5 characters or longer than 50 characters are ignored.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

audio_file = "https://assembly.ai/sports_injuries.mp3"

config = aai.TranscriptionConfig(
    speech_models="universal-2",
    language_detection=True,
    # Words or phrases to improve transcription accuracy for
    keyterms_prompt=['differential diagnosis', 'hypertension', 'Wellbutrin XL 150mg']
)

# Submit the audio and wait for the transcript
transcript = aai.Transcriber(config=config).transcribe(audio_file)

print(transcript.text)
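
Because Universal-2 ignores keyterms shorter than 5 or longer than 50 characters and accepts at most 200 of them, it can help to see in advance which terms will have no effect. The sketch below is a hypothetical helper, not part of the SDK. It assumes length is measured in characters as Python's len() counts them, spaces included. The source does not say how the API handles more than 200 keyterms, so the helper only reports the overflow rather than guessing.

```python
def split_universal2_keyterms(keyterms, min_chars=5, max_chars=50, max_terms=200):
    """Split keyterms into (usable, ignored, overflow) for Universal-2.

    Hypothetical helper based on the documented limits:
    - ignored: terms shorter than min_chars or longer than max_chars
    - overflow: in-range terms beyond the first max_terms
    """
    usable, ignored = [], []
    for term in keyterms:
        if min_chars <= len(term) <= max_chars:
            usable.append(term)
        else:
            ignored.append(term)
    return usable[:max_terms], ignored, usable[max_terms:]


if __name__ == "__main__":
    usable, ignored, overflow = split_universal2_keyterms(
        ["differential diagnosis", "hypertension", "Wellbutrin XL 150mg", "XL"]
    )
    print("usable:", usable)
    print("ignored:", ignored)
    print("overflow:", overflow)
```

The usable list can then be passed as keyterms_prompt, while the ignored and overflow lists show which terms to rephrase or drop.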