In this tutorial, we'll learn how to use LLMs to extract call insights in just a few lines of code. In particular, we'll extract a summary, action items, and contact information from a sample customer call.
The following information will be returned by the LLM:
SUMMARY:
- The caller is interested in getting an estimate for building a house on a property he is looking to purchase in Westchester.
ACTION ITEMS:
- Have someone call the customer back today to discuss building estimate.
- Set up time for builder to assess potential property site prior to purchase.
CONTACT INFORMATION:
Name: Lindstrom, Kenny
Phone number: 610-265-1715
Getting Started
To follow along with this tutorial, you’ll need to have Python installed and an AssemblyAI API key.
All of the code in this tutorial is available in the project repository on GitHub.
Setting Up Your Environment
First, create a directory for the project and navigate into it. Then, create a virtual environment and activate it:
# Mac/Linux
python3 -m venv venv
. venv/bin/activate
# Windows
python -m venv venv
.\venv\Scripts\activate.bat
Now, install the AssemblyAI Python SDK:
pip install assemblyai
Finally, set your API key as an environment variable with the following command, replacing <YOUR_KEY> with your AssemblyAI API key copied from your dashboard.
# Mac/Linux
export ASSEMBLYAI_API_KEY=<YOUR_KEY>
# Windows
set ASSEMBLYAI_API_KEY=<YOUR_KEY>
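A missing or empty key can be easy to overlook, for example after opening a new terminal. The following is a minimal sketch of an early check that could run before any transcription work; `get_api_key` is a hypothetical helper (not part of the AssemblyAI SDK), and it assumes the variable name `ASSEMBLYAI_API_KEY` used in the commands above.

```python
import os


def get_api_key(env=os.environ):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; `env` is any mapping, which
    makes the check easy to exercise without touching the real environment.
    """
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ASSEMBLYAI_API_KEY is not set. Set it in this terminal "
            "before running the script."
        )
    return key
```

Calling `get_api_key()` at the top of a script turns a missing key into an immediate, readable error rather than a failure later in the run.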
Transcribing the Call
Now we're ready to start writing our application code. We’ll first need to transcribe the phone call so that it can be processed by an LLM. We can do this in just a few lines of code using AssemblyAI’s Python SDK. Paste the following code into a file called main.py.
import assemblyai as aai
# Create a Transcriber object
transcriber = aai.Transcriber()
# Transcribe the audio file
transcript = transcriber.transcribe("https://storage.googleapis.com/aai-web-samples/Custom-Home-Builder.mp3")
# Print the transcribed text
print(f"TRANSCRIPT:\n{transcript.text}\n")
This code creates a Transcriber object and then calls its transcribe method with the URL of our audio file. Finally, it prints the transcript text. You can use either a remote file or a local file; if the file is remote, ensure that it is publicly accessible.
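Since the audio source can be either a URL or a local path, it can help to validate it before starting a transcription. The sketch below is one possible pre-check using only the standard library; `check_audio_source` is a hypothetical helper, and it cannot confirm that a remote file is publicly accessible, only that a local file exists.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_audio_source(source):
    """Classify an audio source as 'remote' or 'local'.

    Hypothetical helper for illustration. Remote URLs are only
    recognized by their scheme; local paths must point to an
    existing file or a FileNotFoundError is raised.
    """
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return "remote"
    if not Path(source).is_file():
        raise FileNotFoundError(f"Local audio file not found: {source}")
    return "local"


kind = check_audio_source(
    "https://storage.googleapis.com/aai-web-samples/Custom-Home-Builder.mp3"
)
print(kind)  # remote
```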
Extracting insights with LeMUR
Now that we have the transcript, we can easily extract insights using LeMUR, AssemblyAI's framework for building LLM applications on audio data. We'll start by defining what we want the LLM to do.
Defining the task
In this example, we’ll provide LeMUR with a prompt that will extract our desired insights. While we could simply ask LeMUR for a summary directly, we’ll instead take advantage of some prompting best practices to get more reliable results.
We specify the role of the LLM to describe how it should behave and respond, and we provide context so that the model understands the environment in which it is working (just as an employee would). Then, we provide an instruction that states the task we want accomplished, and finally we provide an answer format so that the output arrives in a standardized form.
Add the following code to main.py:
prompt = """
ROLE:
You are a customer service professional. You are very competent and able to extract meaningful insights from transcripts of customer calls that are submitted to you.
CONTEXT:
This is a call from someone who is inquiring at a home building company
INSTRUCTION:
Respond to the following command: "Provide a short summary of the phone call, and list any outstanding action items after the summary. Finally, provide the caller's contact information. Do not include a preamble."
FORMAT:
SUMMARY:
a one or two sentence summary
ACTION ITEMS:
a bulleted list of sufficiently detailed action items
CONTACT INFORMATION:
Name: Last name, first name
Phone number: The caller's phone number
""".strip()
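If the same four-part structure is reused across several kinds of calls, the prompt could be assembled from its parts rather than written out by hand each time. This is a minimal sketch of that idea, not part of the tutorial's main.py; `build_prompt` and its parameter names are hypothetical, and the example values are shortened from the prompt above.

```python
def build_prompt(role, context, instruction, answer_format):
    """Assemble the four labeled sections into a single prompt string.

    Hypothetical helper for illustration; the section labels mirror
    the ROLE / CONTEXT / INSTRUCTION / FORMAT layout used above.
    """
    sections = [
        ("ROLE", role),
        ("CONTEXT", context),
        ("INSTRUCTION", instruction),
        ("FORMAT", answer_format),
    ]
    # Each section is its label on one line followed by its text
    return "\n".join(f"{label}:\n{text.strip()}" for label, text in sections)


example_prompt = build_prompt(
    role="You are a customer service professional.",
    context="This is a call from someone who is inquiring at a home building company",
    instruction="Provide a short summary of the phone call.",
    answer_format="SUMMARY:\na one or two sentence summary",
)
print(example_prompt)
```

Keeping the sections as separate arguments makes it straightforward to swap the context for a different kind of call while leaving the role and answer format unchanged.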
Extracting call insights with LeMUR
Now that we have our prompt, all we have to do is pass it to LeMUR to get a response. The AssemblyAI Python SDK makes this easy: we just call the .lemur.task method of our transcript and pass in the prompt.
result = transcript.lemur.task(prompt)
We access the response text via the response attribute, and strip off any extraneous whitespace characters with strip().
print(result.response.strip())
Add the above two lines of code to main.py, and then run it with python main.py from the terminal in which you set your API key. After a few moments, we are presented with the phone call insights extracted by the LLM:
SUMMARY:
- The caller is interested in getting an estimate for building a house on a property he is looking to purchase in Westchester.
ACTION ITEMS:
- Have someone call the customer back today to discuss building estimate.
- Set up time for builder to assess potential property site prior to purchase.
CONTACT INFORMATION:
Name: Lindstrom, Kenny
Phone number: 610-265-1715
If you check the call transcript (or listen to the original call), you will see that all of the information is indeed accurate.
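Because the prompt requests a fixed answer format, the response text can be turned into structured data for downstream use, such as saving the contact details to a CRM. The sketch below shows one way to do this with plain string handling; `parse_insights` is a hypothetical helper, and it assumes the model followed the requested format exactly, which an LLM does not guarantee.

```python
def parse_insights(response_text):
    """Split a SUMMARY / ACTION ITEMS / CONTACT INFORMATION response into a dict.

    Hypothetical helper for illustration; lines that appear before the
    first recognized header are ignored.
    """
    headers = {
        "SUMMARY:": "summary",
        "ACTION ITEMS:": "action_items",
        "CONTACT INFORMATION:": "contact",
    }
    insights = {"summary": [], "action_items": [], "contact": {}}
    current = None
    for raw_line in response_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line in headers:
            current = headers[line]
        elif current == "contact":
            # Contact lines look like "Name: Lindstrom, Kenny"
            key, _, value = line.partition(":")
            insights["contact"][key.strip()] = value.strip()
        elif current is not None:
            # Drop a leading bullet marker, if present
            insights[current].append(line.lstrip("-• ").strip())
    return insights


sample = """SUMMARY:
- The caller is interested in getting an estimate for building a house on a property he is looking to purchase in Westchester.
ACTION ITEMS:
- Have someone call the customer back today to discuss building estimate.
- Set up time for builder to assess potential property site prior to purchase.
CONTACT INFORMATION:
Name: Lindstrom, Kenny
Phone number: 610-265-1715"""

insights = parse_insights(sample)
print(insights["contact"]["Phone number"])  # 610-265-1715
```

In main.py, the same function could be called on `result.response` instead of the hard-coded sample.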
Use cases for extracting call insights
Extracting insights from phone calls has use cases across many industries. Just a few of them are:
- lead intelligence
- conversational data analysis
- sales coaching
To learn more about potential use cases for extracting call insights, you can check out this blog.
Final words
In this tutorial, we learned how to extract insights from customer calls using AssemblyAI's Python SDK. By transcribing customer calls and leveraging LeMUR, you can gain valuable information to enhance your services and meet customer needs more effectively.
If you want to learn more about how to analyze audio data with LLMs, you can check out some of our other resources, such as:
- Retrieval Augmented Generation on audio data with LangChain and Chroma
- Ask podcast questions with Semantic Kernel, GPT, and Chroma DB
- How to build an interactive lecture summarization app
Alternatively, check out the other content on our Blog or YouTube channel for additional learning resources on AI.