Process speaker labels with LeMUR

In this guide, you’ll learn how to use AssemblyAI’s API to transcribe audio, identify speakers, and infer their names using LeMUR. We’ll walk through the process of configuring the transcriber, submitting a transcript to LeMUR with speaker labels, and generating a mapping of speaker names from the transcript.

This workflow replaces the generic speaker labels in your transcripts with the speakers' names:

Before:
Speaker A: G'day, bud.
Speaker B: How are you? Very good.

After:
Ben: G'day, bud.
Bryce: How are you? Very good.

Before you begin

To complete this tutorial, you need:

For the entire source code of this guide, see Speaker Identification.

Step-by-step instructions

Install the Python SDK:

$ pip install assemblyai

Import the assemblyai package and set your API key:

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

Define a Transcriber and a TranscriptionConfig with speaker_labels set to True. Then create a transcript.

transcriber = aai.Transcriber()
config = aai.TranscriptionConfig(speaker_labels=True)
audio_url = "https://www.listennotes.com/e/p/accd617c94a24787b2e0800f264b7a5e/"
transcript = transcriber.transcribe(audio_url, config)

Process the transcript with speaker labels:

text_with_speaker_labels = ""
for utt in transcript.utterances:
    text_with_speaker_labels += f"Speaker {utt.speaker}:\n{utt.text}\n"
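To see the shape of the text that will be sent to LeMUR without calling the API, here is a minimal sketch that applies the same formatting to stand-in utterances. The `Utterance` dataclass and the `format_utterances` helper are hypothetical; they only mimic the `speaker` and `text` attributes used above and are not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    # Hypothetical stand-in for the utterance objects on a transcript.
    speaker: str
    text: str


def format_utterances(utterances):
    # Emit a "Speaker X:" header line followed by the utterance text.
    return "".join(f"Speaker {u.speaker}:\n{u.text}\n" for u in utterances)


sample = [
    Utterance("A", "G'day, bud."),
    Utterance("B", "How are you? Very good."),
]
text_with_speaker_labels = format_utterances(sample)
print(text_with_speaker_labels)
```

Building the string with `"".join(...)` is equivalent to the `+=` loop and avoids repeated string concatenation on long transcripts.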

Collect the unique speakers, then create a LemurQuestion for each one. Lastly, ask LeMUR the questions, passing text_with_speaker_labels as the input_text.

unique_speakers = set(utterance.speaker for utterance in transcript.utterances)

questions = []
for speaker in unique_speakers:
    questions.append(
        aai.LemurQuestion(
            question=f"Who is speaker {speaker}?",
            answer_format="<First Name> <Last Name (if applicable)>"
        )
    )

result = aai.Lemur().question(
    questions,
    input_text=text_with_speaker_labels,
    context="Your task is to infer the speaker's name from the speaker-labelled transcript"
)
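Because a Python set has no guaranteed iteration order, the questions may be generated in a different order from run to run. If a stable order matters (for example, when comparing results), one option is to sort the labels first. The sketch below only builds the question strings offline; `build_question_texts` is a hypothetical helper, not an SDK function.

```python
def build_question_texts(speakers):
    # Deduplicate the labels, then sort so the order is stable across runs.
    return [f"Who is speaker {s}?" for s in sorted(set(speakers))]


question_texts = build_question_texts(["B", "A", "B", "A", "C"])
print(question_texts)
```

Each resulting string could then be passed as the `question` argument when constructing the questions as shown above.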

Map each speaker letter to the name returned by LeMUR:

import re

speaker_mapping = {}
for qa_response in result.response:
    # Recover the speaker letter from the question that was asked.
    pattern = r"Who is speaker (\w)\?"
    match = re.search(pattern, qa_response.question)
    # Keep only the first answer for each speaker.
    if match and match.group(1) not in speaker_mapping:
        speaker_mapping[match.group(1)] = qa_response.answer
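The extraction logic can be checked without calling LeMUR. The following sketch runs the same regular expression over hypothetical (question, answer) pairs that stand in for `result.response`; the sample data and the `build_speaker_mapping` helper are assumptions made for illustration.

```python
import re

# Hypothetical (question, answer) pairs standing in for the LeMUR responses.
sample_responses = [
    ("Who is speaker A?", "Ben Kingsley"),
    ("Who is speaker B?", "Bryce"),
    ("Who is speaker A?", "Someone Else"),  # repeated label, ignored below
]


def build_speaker_mapping(pairs):
    pattern = re.compile(r"Who is speaker (\w)\?")
    mapping = {}
    for question, answer in pairs:
        match = pattern.search(question)
        # Keep only the first answer seen for each speaker label.
        if match and match.group(1) not in mapping:
            mapping[match.group(1)] = answer
    return mapping


speaker_mapping = build_speaker_mapping(sample_responses)
print(speaker_mapping)
```

Note that `(\w)` captures a single character, which matches the single-letter labels shown in this guide.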

Print the transcript with speaker names:

for utterance in transcript.utterances:
    speaker_name = speaker_mapping[utterance.speaker]
    print(f"{speaker_name}: {utterance.text}")

Output:

Ben Kingsley: G'day, folks. Ben Kingsley here in this throwback Tuesday bonus episode, ...
Bryce: All right, folks, you're on the property couch, where each week, Ben and I give you the insider's guide to property investing. Hi, mate.
Ben Kingsley: G'day, bud.
Bryce: How are you? Very good. Hey, we should do a little sound check here, Ben...
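If the mapping has no entry for a speaker, the dictionary lookup in the print loop raises a KeyError. A possible safeguard, sketched here with hypothetical sample data and a hypothetical `label_for` helper, is to fall back to the generic label:

```python
def label_for(speaker, speaker_mapping):
    # Use the inferred name when available, otherwise keep "Speaker X".
    return speaker_mapping.get(speaker, f"Speaker {speaker}")


# Hypothetical mapping in which only speaker A was identified.
mapping = {"A": "Ben Kingsley"}
sample = [("A", "G'day, bud."), ("B", "How are you?")]

lines = [f"{label_for(speaker, mapping)}: {text}" for speaker, text in sample]
for line in lines:
    print(line)
```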