Identifying highlights in audio and video files

The Key Phrases model identifies significant words and phrases in your transcript and lets you extract the most important concepts or highlights from your audio or video file.

For example, if you run a call center, you can analyze highlights from recorded phone calls.

In this step-by-step guide, you’ll learn how to apply the model. You’ll send the auto_highlights parameter in your request, and then use the auto_highlights_result property in the response.

Get started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.

The complete source code for this guide can be viewed here.

Here’s an audio sample for this guide:

https://assembly.ai/wildfires.mp3

Step-by-step instructions

1. Install the SDK.

pip install -U assemblyai
2. Import the assemblyai package and set the API key.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"
3. Create a TranscriptionConfig with auto_highlights set to True.

config = aai.TranscriptionConfig(auto_highlights=True)
4. Create a Transcriber object and pass in the configuration.

transcriber = aai.Transcriber(config=config)
5. Pass the URL or file path to Transcriber.transcribe(). You can access the transcript from the returned Transcript object.

FILE_URL = "https://assembly.ai/wildfires.mp3"

transcript = transcriber.transcribe(FILE_URL)
6. You can access automatic highlights from transcript.auto_highlights.results.

for result in transcript.auto_highlights.results:
    print(f"Highlight: {result.text}, Count: {result.count}, Rank: {result.rank}, Timestamps: {result.timestamps}")
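Transcription can fail, for example if the file URL isn't reachable. After calling Transcriber.transcribe() in step 5, you can check whether the job succeeded before reading highlights. Here's a minimal sketch, assuming the SDK's TranscriptStatus enum and the transcript's error attribute:

# Check whether transcription succeeded before reading highlights
if transcript.status == aai.TranscriptStatus.error:
    print(f"Transcription failed: {transcript.error}")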

Understanding the response

The auto_highlights_result key in the response contains a list of all the highlights found in the transcription text. Each entry includes the text of the detected phrase or word (text), how many times it occurs in the text (count), its relevancy score (rank), and a list of timestamps (timestamps), in milliseconds, marking where in the audio the phrase or word is spoken.
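As an illustration, the sketch below sorts the highlights by relevancy score and prints each occurrence in seconds. It assumes each timestamp entry exposes start and end values in milliseconds and that a higher rank means a more relevant phrase.

# Rank highlights by relevancy score, assuming higher rank = more relevant
top_highlights = sorted(
    transcript.auto_highlights.results, key=lambda r: r.rank, reverse=True
)

for result in top_highlights[:5]:
    # Convert each occurrence from milliseconds to seconds for readability
    spans = ", ".join(
        f"{t.start / 1000:.1f}s to {t.end / 1000:.1f}s" for t in result.timestamps
    )
    print(f"{result.text} (rank {result.rank:.2f}, count {result.count}): {spans}")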

For more information about the API response, see API/Model reference.

Conclusion

Automatically highlighting relevant phrases in calls is a great way to focus on important information at a glance. In general, adding AI to Conversation Intelligence tools can augment them: generating actionable summaries to speed up call review, surfacing insights, monitoring for concerns, increasing engagement, and more. Our AI summarization model has several customizable parameters that you can experiment with for other types of recordings.
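For example, you could request a summary of the same recording with the Python SDK. This is a minimal sketch, assuming the SDK's summarization parameters (summarization, summary_model, summary_type) and the transcript's summary property:

# Request a bullet-point summary of a conversational recording
config = aai.TranscriptionConfig(
    summarization=True,
    summary_model=aai.SummarizationModel.conversational,
    summary_type=aai.SummarizationType.bullets,
)

transcript = aai.Transcriber(config=config).transcribe(FILE_URL)
print(transcript.summary)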

To learn more about how to use AI summarization for call coaching, see the AssemblyAI blog.
