Custom Topic Tags Using LLM Gateway

In this guide we will show you how to label content with custom topic tags using AssemblyAI’s LLM Gateway.

Quickstart

Python
import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

# Step 1: Transcribe the audio
audio_url = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
data = {"audio_url": audio_url, "speech_models": ["universal-3-pro"]}

response = requests.post(base_url + "/v2/transcript", json=data, headers=headers)
transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()
    if transcription_result['status'] == 'completed':
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)

# Step 2: Generate topic tags with LLM Gateway
tag_list = {
    'Sports': 'News and updates on various athletic events, teams, and sports personalities.',
    'Politics': 'Coverage and discussion of government activities, policies, and political events.',
    'Entertainment': 'Information on movies, music, television, celebrities, and arts.',
    'Technology': 'News and reviews on gadgets, software, tech advancements, and trends.',
    'Health': 'Articles focusing on medical news, wellness, and health-related topics.',
    'Business': 'Updates on markets, industries, companies, and economic trends.',
    'Science': 'News and insights into scientific discoveries, research, and innovations.',
    'Education': 'Coverage of topics related to schools, educational policies, and learning.',
    'Travel': 'Information on destinations, travel tips, and tourism news.',
    'Lifestyle': 'Articles on fashion, hobbies, personal interests, and daily life.',
    'Environment': 'News and discussion about environmental issues and sustainability.',
    'Finance': 'Information on personal finance, investments, banking, and economic news.',
    'World News': 'International news covering global events and issues.',
    'Crime': 'Reports and updates on criminal activities, law enforcement, and legal cases.',
    'Culture': 'Coverage of cultural events, traditions, and societal norms.'
}

prompt = f"""
You are a helpful assistant designed to label video content with topic tags.

I will give you a list of topics and definitions. Select the most relevant topic from the list. Return your selection and nothing else.

<topics_list>
{tag_list}
</topics_list>
"""

llm_gateway_data = {
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
        {"role": "user", "content": f"{prompt}\n\n{{{{ transcript }}}}"}
    ],
    "transcript_id": transcript_id,
    "max_tokens": 500
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json=llm_gateway_data
)

result = response.json()["choices"][0]["message"]["content"]
print(result.strip())

Get Started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard.

Step-by-Step Instructions

Install dependencies

Install the required packages:

Python
$ pip install requests

Set up your API client and transcribe the audio file:

Python
import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

# Transcribe the audio.
# audio_url can be any URL to an audio or video file on the web.
audio_url = "https://storage.googleapis.com/aai-web-samples/meeting.mp4"
data = {"audio_url": audio_url, "speech_models": ["universal-3-pro"]}

# Submit the transcription request and keep the transcript ID for polling
response = requests.post(base_url + "/v2/transcript", json=data, headers=headers)
transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

# Poll every 3 seconds until the transcript completes or fails
while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()
    if transcription_result['status'] == 'completed':
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)
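
The polling loop above runs until the transcript completes or fails, so a job that never leaves the queue would block forever. The following is a minimal sketch of the same loop with a timeout. The helper name poll_until_done and its interval, timeout, sleep, and clock parameters are hypothetical and not part of any AssemblyAI SDK; only the status values 'completed' and 'error' come from the code above.

```python
import time


def poll_until_done(fetch_status, interval=3.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the transcript completes, fails, or times out.

    fetch_status is any zero-argument callable that returns the transcript
    JSON as a dict (hypothetical helper, for illustration only).
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(f"Transcription failed: {result.get('error')}")
        if clock() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout} seconds")
        sleep(interval)
```

With the variables defined above, it could be called as poll_until_done(lambda: requests.get(polling_endpoint, headers=headers).json()). Passing the fetch step as a callable keeps the loop independent of the HTTP client, which also makes it easy to exercise without network access.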

Create a tag_list dictionary of custom topics. Each key is a topic name, and each value is a short description of what qualifies a file to be labeled with that topic.

Here is an example of a tag_list that can be used for videos or podcasts:

Python
tag_list = {
    'Sports': 'News and updates on various athletic events, teams, and sports personalities.',
    'Politics': 'Coverage and discussion of government activities, policies, and political events.',
    'Entertainment': 'Information on movies, music, television, celebrities, and arts.',
    'Technology': 'News and reviews on gadgets, software, tech advancements, and trends.',
    'Health': 'Articles focusing on medical news, wellness, and health-related topics.',
    'Business': 'Updates on markets, industries, companies, and economic trends.',
    'Science': 'News and insights into scientific discoveries, research, and innovations.',
    'Education': 'Coverage of topics related to schools, educational policies, and learning.',
    'Travel': 'Information on destinations, travel tips, and tourism news.',
    'Lifestyle': 'Articles on fashion, hobbies, personal interests, and daily life.',
    'Environment': 'News and discussion about environmental issues and sustainability.',
    'Finance': 'Information on personal finance, investments, banking, and economic news.',
    'World News': 'International news covering global events and issues.',
    'Crime': 'Reports and updates on criminal activities, law enforcement, and legal cases.',
    'Culture': 'Coverage of cultural events, traditions, and societal norms.'
}

Here is another example of a tag_list that can be used for support calls:

Python
tag_list = {
    'Account Issues': 'Problems related to user accounts, such as login difficulties or account access.',
    'Technical Support': 'Inquiries regarding software or hardware functionality and troubleshooting.',
    'Billing and Payments': 'Questions or problems about invoices, payments, or subscription plans.',
    'Product Inquiry': 'Requests for information about product features, capabilities, or availability.',
    'Service Disruption': 'Reports of outages or interruptions in service performance or availability.'
}
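
When a dictionary like the ones above is interpolated into an f-string, Python inserts its string representation, braces and quotes included. If you would rather hand the model one topic per line, the sketch below shows one way to render the dictionary first. The function name format_tag_list and the variable names support_tags and topics_block are hypothetical, and the line format is an arbitrary choice, not something the guide prescribes.

```python
def format_tag_list(tag_list):
    """Render each topic and its description on its own '- topic: description' line."""
    return "\n".join(
        f"- {topic}: {description}" for topic, description in tag_list.items()
    )


# Two entries taken from the support-call example above
support_tags = {
    'Account Issues': 'Problems related to user accounts, such as login difficulties or account access.',
    'Billing and Payments': 'Questions or problems about invoices, payments, or subscription plans.',
}

topics_block = format_tag_list(support_tags)
print(topics_block)
```

The resulting string can be placed inside the topics_list section of a prompt in place of the raw dictionary.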

Use LLM Gateway to analyze the transcript and select the most relevant topic tag. This is an example prompt, which you can modify to suit your specific requirements.

Python
prompt = f"""
You are a helpful assistant designed to label video content with topic tags.

I will give you a list of topics and definitions. Select the most relevant topic from the list. Return your selection and nothing else.

<topics_list>
{tag_list}
</topics_list>
"""

# The quadruple braces in the f-string below emit the literal text {{ transcript }}
llm_gateway_data = {
    "model": "claude-sonnet-4-5-20250929",
    "messages": [
        {"role": "user", "content": f"{prompt}\n\n{{{{ transcript }}}}"}
    ],
    "transcript_id": transcript_id,
    "max_tokens": 500
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json=llm_gateway_data
)

# Extract the model's reply and trim surrounding whitespace
result = response.json()["choices"][0]["message"]["content"]
print(result.strip())
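
The prompt asks the model to return one topic and nothing else, but the reply is still free text, so it can be worth checking it against the keys of tag_list before storing it. The following is a minimal sketch of such a check. The function name match_tag is hypothetical, and the normalization (trimming whitespace, quotes, and a trailing period, then comparing case-insensitively) is an assumption about how replies might vary, not documented model behavior.

```python
def match_tag(raw_output, tag_list):
    """Return the tag_list key named by the model output, or None if nothing matches."""
    # Trim whitespace, then surrounding quotes or periods, and ignore case
    cleaned = raw_output.strip().strip("\"'.").casefold()
    for topic in tag_list:
        if topic.casefold() == cleaned:
            return topic
    return None


# Small illustrative subset of the topics used in this guide
example_tags = {
    'Technology': 'News and reviews on gadgets, software, tech advancements, and trends.',
    'World News': 'International news covering global events and issues.',
}

print(match_tag(" Technology.\n", example_tags))
print(match_tag("Gardening", example_tags))
```

The match is deliberately strict: a reply that adds an explanation after the topic name returns None, which gives you a clear signal to retry or to flag the file for review.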