Custom Topic Tags Using LLM Gateway
This guide shows how to label content with custom topic tags using AssemblyAI’s LLM Gateway.
Quickstart
```python
import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

# Step 1: Transcribe the audio
audio_url = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
data = {"audio_url": audio_url}

response = requests.post(base_url + "/v2/transcript", json=data, headers=headers)
transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()
    if transcription_result['status'] == 'completed':
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)

# Step 2: Generate topic tags with LLM Gateway
tag_list = {
    'Sports': 'News and updates on various athletic events, teams, and sports personalities.',
    'Politics': 'Coverage and discussion of government activities, policies, and political events.',
    'Entertainment': 'Information on movies, music, television, celebrities, and arts.',
    'Technology': 'News and reviews on gadgets, software, tech advancements, and trends.',
    'Health': 'Articles focusing on medical news, wellness, and health-related topics.',
    'Business': 'Updates on markets, industries, companies, and economic trends.',
    'Science': 'News and insights into scientific discoveries, research, and innovations.',
    'Education': 'Coverage of topics related to schools, educational policies, and learning.',
    'Travel': 'Information on destinations, travel tips, and tourism news.',
    'Lifestyle': 'Articles on fashion, hobbies, personal interests, and daily life.',
    'Environment': 'News and discussion about environmental issues and sustainability.',
    'Finance': 'Information on personal finance, investments, banking, and economic news.',
    'World News': 'International news covering global events and issues.',
    'Crime': 'Reports and updates on criminal activities, law enforcement, and legal cases.',
    'Culture': 'Coverage of cultural events, traditions, and societal norms.'
}

prompt = f"""
You are a helpful assistant designed to label video content with topic tags.

I will give you a list of topics and definitions. Select the most relevant topic from the list. Return your selection and nothing else.

<topics_list>
{tag_list}
</topics_list>
"""

llm_gateway_data = {
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
        {"role": "user", "content": f"{prompt}\n\nTranscript: {transcription_result['text']}"}
    ],
    "max_tokens": 500
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json=llm_gateway_data
)

result = response.json()["choices"][0]["message"]["content"]
print(result.strip())
```
Get Started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard.
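If you prefer not to paste the API key into your script, one option is to read it from an environment variable. The sketch below is an assumption on our part rather than something this guide requires: the variable name `ASSEMBLYAI_API_KEY` and the helper `build_headers` are hypothetical names chosen for illustration.

```python
import os


def build_headers(env_var: str = "ASSEMBLYAI_API_KEY") -> dict:
    """Build the request headers from an environment variable.

    The variable name is a hypothetical choice for this example;
    use whatever name fits your deployment.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    # Same header shape as the `headers` dict used throughout this guide.
    return {"authorization": api_key}
```

With this in place, `headers = build_headers()` could stand in for the hard-coded `headers` dictionary in the steps that follow.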
Step-by-Step Instructions
Install the required packages:
$ pip install requests
Set up your API client and transcribe the audio file:
```python
import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

# Transcribe the audio
audio_url = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
data = {"audio_url": audio_url}  # URL of an audio or video file on the web

# Submit the file for transcription
response = requests.post(base_url + "/v2/transcript", json=data, headers=headers)
transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

# Poll every 3 seconds until the transcript completes or fails
while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()
    if transcription_result['status'] == 'completed':
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
```
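The polling loop above runs until the transcript either completes or fails. If you want an upper bound on how long the script waits, a variant along the following lines may help. This is a minimal sketch, not part of the AssemblyAI API: `wait_for_transcript`, its `timeout` and `poll_interval` parameters, and the injected `fetch_status` callable are all hypothetical names introduced for this example.

```python
import time
from typing import Callable


def wait_for_transcript(
    fetch_status: Callable[[], dict],
    poll_interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the transcript completes, fails, or the timeout elapses.

    `fetch_status` is any zero-argument callable returning the transcript
    JSON as a dict, so the HTTP call stays outside this helper.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(f"Transcription failed: {result.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout} seconds")
        sleep(poll_interval)
```

Assuming the `polling_endpoint` and `headers` variables from the step above, a call could look like `transcription_result = wait_for_transcript(lambda: requests.get(polling_endpoint, headers=headers).json())`.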
Create a tag_list dictionary of custom topics. Each key is a topic name, and each value is a short description of what qualifies a file to be labeled with that topic.
Here is an example of a tag_list that can be used for videos or podcasts:
```python
tag_list = {
    'Sports': 'News and updates on various athletic events, teams, and sports personalities.',
    'Politics': 'Coverage and discussion of government activities, policies, and political events.',
    'Entertainment': 'Information on movies, music, television, celebrities, and arts.',
    'Technology': 'News and reviews on gadgets, software, tech advancements, and trends.',
    'Health': 'Articles focusing on medical news, wellness, and health-related topics.',
    'Business': 'Updates on markets, industries, companies, and economic trends.',
    'Science': 'News and insights into scientific discoveries, research, and innovations.',
    'Education': 'Coverage of topics related to schools, educational policies, and learning.',
    'Travel': 'Information on destinations, travel tips, and tourism news.',
    'Lifestyle': 'Articles on fashion, hobbies, personal interests, and daily life.',
    'Environment': 'News and discussion about environmental issues and sustainability.',
    'Finance': 'Information on personal finance, investments, banking, and economic news.',
    'World News': 'International news covering global events and issues.',
    'Crime': 'Reports and updates on criminal activities, law enforcement, and legal cases.',
    'Culture': 'Coverage of cultural events, traditions, and societal norms.'
}
```
Here is another example of a tag_list that can be used for support calls:
```python
tag_list = {
    'Account Issues': 'Problems related to user accounts, such as login difficulties or account access.',
    'Technical Support': 'Inquiries regarding software or hardware functionality and troubleshooting.',
    'Billing and Payments': 'Questions or problems about invoices, payments, or subscription plans.',
    'Product Inquiry': 'Requests for information about product features, capabilities, or availability.',
    'Service Disruption': 'Reports of outages or interruptions in service performance or availability.'
}
```
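When a tag_list is interpolated directly into an f-string, the prompt receives Python's dictionary representation, braces and quotes included. That works, but you may prefer to render one topic per line. The helper below is a small sketch of that idea; `format_tag_list` is a hypothetical name, and the `- topic: description` layout is just one possible choice.

```python
def format_tag_list(tag_list: dict) -> str:
    """Render each topic and its description on its own line."""
    return "\n".join(
        f"- {topic}: {description}" for topic, description in tag_list.items()
    )


example = {
    "Account Issues": "Problems related to user accounts.",
    "Billing and Payments": "Questions about invoices or payments.",
}
print(format_tag_list(example))
```

The resulting string could then be placed inside the `<topics_list>` tags of the prompt instead of the raw dictionary.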
Use LLM Gateway to analyze the transcript and select the most relevant topic tag. This is an example prompt, which you can modify to suit your specific requirements. See our documentation for more information about prompt engineering.
```python
prompt = f"""
You are a helpful assistant designed to label video content with topic tags.

I will give you a list of topics and definitions. Select the most relevant topic from the list. Return your selection and nothing else.

<topics_list>
{tag_list}
</topics_list>
"""

llm_gateway_data = {
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
        {"role": "user", "content": f"{prompt}\n\nTranscript: {transcription_result['text']}"}
    ],
    "max_tokens": 500
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json=llm_gateway_data
)

result = response.json()["choices"][0]["message"]["content"]
print(result.strip())
```
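The prompt asks the model to return a single topic and nothing else, but a model can still add punctuation, change the casing, or answer with a full sentence. If you plan to store the tag, it may be worth checking the output against your tag_list first. The following is a minimal sketch under that assumption; `match_tag` is a hypothetical helper, and exact matching after light cleanup is only one possible policy.

```python
from typing import Optional


def match_tag(raw_output: str, tag_list: dict) -> Optional[str]:
    """Return the tag_list key matching the model output, or None.

    Surrounding whitespace, quotes, backticks, and periods are removed,
    and the comparison ignores case.
    """
    cleaned = raw_output.strip().strip("\"'`.").casefold()
    for topic in tag_list:
        if topic.casefold() == cleaned:
            return topic
    return None


topics = {"Technology": "Gadgets and software.", "World News": "Global events."}
print(match_tag(" technology.\n", topics))
print(match_tag("The topic is Sports", topics))
```

A `None` result signals that the response did not match any topic, at which point you could retry the request or route the file for manual review.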