Custom Topic Tags | AssemblyAI

This guide shows how to label content with custom topic tags using AssemblyAI’s LLM, LeMUR.

Quickstart

import assemblyai as aai

aai.settings.api_key = "API_KEY"
audio_url = "YOUR_AUDIO_URL"

transcript = aai.Transcriber().transcribe(audio_url)

tag_list = {
    'Sports': 'News and updates on various athletic events, teams, and sports personalities.',
    'Politics': 'Coverage and discussion of government activities, policies, and political events.',
    'Entertainment': 'Information on movies, music, television, celebrities, and arts.',
    'Technology': 'News and reviews on gadgets, software, tech advancements, and trends.',
    'Health': 'Articles focusing on medical news, wellness, and health-related topics.',
    'Business': 'Updates on markets, industries, companies, and economic trends.',
    'Science': 'News and insights into scientific discoveries, research, and innovations.',
    'Education': 'Coverage of topics related to schools, educational policies, and learning.',
    'Travel': 'Information on destinations, travel tips, and tourism news.',
    'Lifestyle': 'Articles on fashion, hobbies, personal interests, and daily life.',
    'Environment': 'News and discussion about environmental issues and sustainability.',
    'Finance': 'Information on personal finance, investments, banking, and economic news.',
    'World News': 'International news covering global events and issues.',
    'Crime': 'Reports and updates on criminal activities, law enforcement, and legal cases.',
    'Culture': 'Coverage of cultural events, traditions, and societal norms.'
}

predicted_tag = transcript.lemur.task(
    prompt=f"""

    You are a helpful assistant designed to label video content with topic tags.

    I will give you a list of topics and definitions. Select the most relevant topic from the list. Return your selection and nothing else.

    <topics_list>
    {tag_list}
    </topics_list>
    """
).response

print(predicted_tag.strip())

Get Started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard. To access LeMUR, you will need to upgrade your account by adding a credit card.

Find more details on current LeMUR pricing on the AssemblyAI pricing page.

Step-by-Step Instructions

Install the SDK:

$ pip install assemblyai

Import the SDK and set your AssemblyAI API key.

import assemblyai as aai

aai.settings.api_key = "API_KEY"

Use AssemblyAI to transcribe a file and save the transcript.

audio_url = "YOUR_AUDIO_URL"

transcript = aai.Transcriber().transcribe(audio_url)
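
Transcription can fail, for example when the audio URL is unreachable, so it may be worth checking the result before prompting LeMUR. The following is a minimal sketch, assuming the returned transcript object exposes an `error` attribute that is empty on success. The helper name `require_transcript` is hypothetical and not part of the SDK, and the stand-in objects below only exist so the sketch runs without calling the API.

```python
from types import SimpleNamespace


def require_transcript(transcript):
    """Return the transcript unchanged, or raise if it reports an error."""
    # Assumption: the transcript carries an `error` attribute that is
    # None (or empty) when transcription succeeded.
    error = getattr(transcript, "error", None)
    if error:
        raise RuntimeError(f"Transcription failed: {error}")
    return transcript


# Stand-in objects for illustration only; in the guide you would pass
# the object returned by aai.Transcriber().transcribe(audio_url).
ok = SimpleNamespace(error=None, text="hello")
failed = SimpleNamespace(error="Download error", text=None)

checked = require_transcript(ok)
```

In the guide's flow this would be called as `transcript = require_transcript(transcript)` right after the transcription step.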

Create a tag_list dictionary of custom topics, in which each key is a topic and each value is a short description of what qualifies a file to be labeled with that topic.

Here is an example of a tag_list that can be used for videos or podcasts:

tag_list = {
    'Sports': 'News and updates on various athletic events, teams, and sports personalities.',
    'Politics': 'Coverage and discussion of government activities, policies, and political events.',
    'Entertainment': 'Information on movies, music, television, celebrities, and arts.',
    'Technology': 'News and reviews on gadgets, software, tech advancements, and trends.',
    'Health': 'Articles focusing on medical news, wellness, and health-related topics.',
    'Business': 'Updates on markets, industries, companies, and economic trends.',
    'Science': 'News and insights into scientific discoveries, research, and innovations.',
    'Education': 'Coverage of topics related to schools, educational policies, and learning.',
    'Travel': 'Information on destinations, travel tips, and tourism news.',
    'Lifestyle': 'Articles on fashion, hobbies, personal interests, and daily life.',
    'Environment': 'News and discussion about environmental issues and sustainability.',
    'Finance': 'Information on personal finance, investments, banking, and economic news.',
    'World News': 'International news covering global events and issues.',
    'Crime': 'Reports and updates on criminal activities, law enforcement, and legal cases.',
    'Culture': 'Coverage of cultural events, traditions, and societal norms.'
}

Here is another example of a tag_list that can be used for support calls:

tag_list = {
    'Account Issues': 'Problems related to user accounts, such as login difficulties or account access.',
    'Technical Support': 'Inquiries regarding software or hardware functionality and troubleshooting.',
    'Billing and Payments': 'Questions or problems about invoices, payments, or subscription plans.',
    'Product Inquiry': 'Requests for information about product features, capabilities, or availability.',
    'Service Disruption': 'Reports of outages or interruptions in service performance or availability.'
}
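
When a dictionary like these is interpolated into an f-string prompt, Python inserts its `repr`, braces and quotes included. That works, but you may prefer to render the topics as plain text, one per line. The sketch below shows one way to do that; the helper name `format_tag_list` and the shortened `example_tags` dictionary are hypothetical and used only for illustration.

```python
from typing import Dict


def format_tag_list(tag_list: Dict[str, str]) -> str:
    """Render each topic as a 'Topic: description' line."""
    return "\n".join(
        f"{topic}: {description}" for topic, description in tag_list.items()
    )


# Shortened example dictionary, for illustration only.
example_tags = {
    "Account Issues": "Problems related to user accounts, such as login difficulties or account access.",
    "Billing and Payments": "Questions or problems about invoices, payments, or subscription plans.",
}

topics_text = format_tag_list(example_tags)
print(topics_text)
```

The resulting string could then be placed between the `<topics_list>` tags of the prompt in place of the raw dictionary. Whether this changes the quality of the returned tag is not something this guide measures, so treat it as an optional formatting choice.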

Prompt LeMUR using the Task Endpoint and return the response. This is an example prompt, which you can modify to suit your specific requirements. See our documentation for more information about prompt engineering.

predicted_tag = transcript.lemur.task(
    prompt=f"""

    You are a helpful assistant designed to label video content with topic tags.

    I will give you a list of topics and definitions. Select the most relevant topic from the list. Return your selection and nothing else.

    <topics_list>
    {tag_list}
    </topics_list>
    """
).response

print(predicted_tag.strip())
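
Although the prompt asks for the selection and nothing else, an LLM response can still come back with different casing, surrounding quotes, or extra words. The following is a minimal sketch of one way to map the response back onto a known topic before storing it. The helper name `match_tag` is hypothetical, and the matching rules (exact match first, then a single unambiguous substring match) are an assumption about what is reasonable, not documented LeMUR behavior.

```python
from typing import Iterable, Optional


def match_tag(response: str, allowed_tags: Iterable[str]) -> Optional[str]:
    """Return the allowed tag named by the response, or None if unclear."""
    # Drop surrounding whitespace, quotes, and trailing periods.
    cleaned = response.strip().strip("\"'.").casefold()
    tags = list(allowed_tags)

    # Prefer an exact, case-insensitive match.
    for tag in tags:
        if tag.casefold() == cleaned:
            return tag

    # Otherwise accept the response only if exactly one tag appears in it.
    contained = [tag for tag in tags if tag.casefold() in cleaned]
    return contained[0] if len(contained) == 1 else None


# Illustration with a short list of topics.
topics = ["Sports", "Politics", "World News"]
print(match_tag(" 'Sports'.\n", topics))
print(match_tag("Sports and Politics", topics))
```

Because iterating over a dictionary yields its keys, the guide's variables can be passed directly as `match_tag(predicted_tag, tag_list)`. A `None` result signals that the response did not clearly name one topic, which you might handle by retrying or flagging the file for review.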