LeMUR

Summarize your audio data

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

transcriber = aai.Transcriber()

# You can use a local filepath:
# audio_file = "./example.mp3"

# Or use a publicly-accessible URL:
audio_file = (
    "https://assembly.ai/sports_injuries.mp3"
)
transcript = transcriber.transcribe(audio_file)

prompt = "Provide a brief summary of the transcript."

result = transcript.lemur.task(
    prompt, final_model=aai.LemurModel.claude3_5_sonnet
)

print(result.response)

If you run the code above, you’ll see the following output:

The transcript describes several common sports injuries - runner's knee,
sprained ankle, meniscus tear, rotator cuff tear, and ACL tear. It provides
definitions, causes, and symptoms for each injury. The transcript seems to be
narrating sports footage and describing injuries as they occur to the athletes.
Overall, it provides an overview of these common sports injuries that can result
from overuse or sudden trauma during athletic activities

Ask questions about your audio data

Q&A with the task endpoint

To ask questions about your audio data, define a prompt with your questions and call transcript.lemur.task(). The underlying transcript is automatically used as additional context for the model.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# Step 1: Transcribe an audio file.
# audio_file = "./local_file.mp3"
audio_file = "https://assembly.ai/sports_injuries.mp3"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(audio_file)

# Step 2: Define a prompt with your question(s).
prompt = "What is a runner's knee?"

# Step 3: Apply LeMUR.
result = transcript.lemur.task(
    prompt, final_model=aai.LemurModel.claude3_5_sonnet
)

print(result.response)

Example output:

Based on the transcript, runner's knee is a condition characterized
by pain behind or around the kneecap. It is caused by overuse,
muscle imbalance and inadequate stretching. Symptoms include pain
under or around the kneecap and pain when walking.

Q&A with the question-answer endpoint

The LeMUR Question & Answer function requires no prompt engineering and facilitates more deterministic and structured outputs. The code example below shows how to use this endpoint.

To use it, define a list of aai.LemurQuestion objects. For each question, you can define additional context and specify either an answer_format or a list of answer_options. Additionally, you can define an overall context.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

audio_url = "https://assembly.ai/meeting.mp4"
transcript = aai.Transcriber().transcribe(audio_url)

questions = [
    aai.LemurQuestion(
        question="What are the top level KPIs for engineering?",
        context="KPI stands for key performance indicator",
        answer_format="short sentence",
    ),
    aai.LemurQuestion(
        question="How many days has it been since the data team has gotten updated metrics?",
        answer_options=["1", "2", "3", "4", "5", "6", "7", "more than 7"],
    ),
]

# Positional arguments must come before keyword arguments,
# so the question list is passed first.
result = transcript.lemur.question(
    questions,
    final_model=aai.LemurModel.claude3_5_sonnet,
    context="A GitLab meeting to discuss logistics",
)

for qa_response in result.response:
    print(f"Question: {qa_response.question}")
    print(f"Answer: {qa_response.answer}")

For the full API reference, supported models, and FAQs, refer to the LeMUR Q&A guide.
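When a question specifies answer_options, a cheap defensive check is to confirm that each returned answer is one of the listed options before using it downstream. The following is a minimal sketch of that idea and is not part of the SDK: QAResponse is a hypothetical stand-in for the items in result.response (only the question and answer attributes used above are assumed), and check_answers is an illustrative helper name.

```python
from dataclasses import dataclass


@dataclass
class QAResponse:
    """Hypothetical stand-in for one item of result.response."""
    question: str
    answer: str


def check_answers(responses, allowed_options):
    """Return the questions whose answer is not among the allowed options.

    allowed_options maps a question string to its list of answer_options;
    questions without an entry are treated as free-form and skipped.
    """
    unexpected = []
    for item in responses:
        options = allowed_options.get(item.question)
        if options is not None and item.answer.strip() not in options:
            unexpected.append(item.question)
    return unexpected


# Sample data for illustration only; this is not real LeMUR output.
responses = [
    QAResponse("How many days?", "more than 7"),
    QAResponse("Is the release on track?", "maybe"),
]
allowed = {
    "How many days?": ["1", "2", "3", "more than 7"],
    "Is the release on track?": ["yes", "no"],
}
print(check_answers(responses, allowed))
```

With real results, the same function could be called with result.response in place of the sample list.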

Change the model type

LeMUR features the following LLMs:

  • Claude 3.5 Sonnet
  • Claude 3 Opus
  • Claude 3 Haiku
  • Claude 3 Sonnet

You can switch the model by specifying the final_model parameter.

result = transcript.lemur.task(
    prompt,
    final_model=aai.LemurModel.claude3_5_sonnet
)

| Model | SDK Parameter | Description |
| --- | --- | --- |
| Claude 3.5 Sonnet | aai.LemurModel.claude3_5_sonnet | Claude 3.5 Sonnet is the most intelligent model to date, outperforming Claude 3 Opus on a wide range of evaluations, with the speed and cost of Claude 3 Sonnet. This uses Anthropic’s Claude 3.5 Sonnet model version claude-3-5-sonnet-20240620. |
| Claude 3.0 Opus | aai.LemurModel.claude3_opus | Claude 3 Opus is good at handling complex analysis, longer tasks with many steps, and higher-order math and coding tasks. |
| Claude 3.0 Haiku | aai.LemurModel.claude3_haiku | Claude 3 Haiku is the fastest model that can execute lightweight actions. |
| Claude 3.0 Sonnet | aai.LemurModel.claude3_sonnet | Claude 3 Sonnet is a legacy model with a balanced combination of performance and speed for efficient, high-throughput tasks. |

You can find more information on pricing for each model here.

Change the maximum output size

You can change the maximum output size in tokens by specifying the max_output_size parameter. Up to 4000 tokens are allowed.

result = transcript.lemur.task(
    prompt,
    max_output_size=1000
)

Change the temperature

You can change the temperature by specifying the temperature parameter, ranging from 0.0 to 1.0.

Higher values produce more creative answers; lower values produce more conservative ones.

result = transcript.lemur.task(
    prompt,
    temperature=0.7
)
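If these values come from user input or configuration, it may help to check them against the documented limits (up to 4000 tokens, temperature from 0.0 to 1.0) before sending a request. The sketch below is one possible way to do that. validate_lemur_params is a hypothetical helper, not an SDK function, and the lower bound of 1 token is an assumption made for illustration.

```python
MAX_OUTPUT_TOKENS = 4000        # documented upper limit for max_output_size
TEMPERATURE_RANGE = (0.0, 1.0)  # documented range for temperature


def validate_lemur_params(max_output_size=None, temperature=None):
    """Return a kwargs dict, raising ValueError for out-of-range values."""
    kwargs = {}
    if max_output_size is not None:
        # Assumption: at least 1 token must be requested.
        if not 0 < max_output_size <= MAX_OUTPUT_TOKENS:
            raise ValueError(
                f"max_output_size must be between 1 and {MAX_OUTPUT_TOKENS}"
            )
        kwargs["max_output_size"] = max_output_size
    if temperature is not None:
        low, high = TEMPERATURE_RANGE
        if not low <= temperature <= high:
            raise ValueError(f"temperature must be between {low} and {high}")
        kwargs["temperature"] = temperature
    return kwargs


params = validate_lemur_params(max_output_size=1000, temperature=0.7)
print(params)
```

The returned dictionary could then be unpacked into the call, for example transcript.lemur.task(prompt, **params).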

Send customized input

You can submit custom text inputs to LeMUR without transcript IDs. This lets you customize the input. For example, you could include speaker labels for the LLM.

To submit custom text input, use the input_text parameter on aai.Lemur().task().

config = aai.TranscriptionConfig(
    speaker_labels=True,
)
transcript = transcriber.transcribe(audio_url, config=config)

text_with_speaker_labels = ""
for utt in transcript.utterances:
    text_with_speaker_labels += f"Speaker {utt.speaker}:\n{utt.text}\n"

result = aai.Lemur().task(
    prompt,
    input_text=text_with_speaker_labels
)
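The formatting step can be tried out without transcribing anything. The following is a small sketch under stated assumptions: Utterance is a hypothetical stand-in for the items in transcript.utterances (only the speaker and text attributes used above are assumed), and format_with_speaker_labels is an illustrative helper that produces the same string as the loop, using str.join instead of repeated concatenation.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical stand-in for the items in transcript.utterances."""
    speaker: str
    text: str


def format_with_speaker_labels(utterances):
    """Build one 'Speaker X:' block per utterance, in order."""
    return "".join(
        f"Speaker {utt.speaker}:\n{utt.text}\n" for utt in utterances
    )


# Sample utterances for illustration only.
sample = [
    Utterance("A", "Hi, thanks for calling."),
    Utterance("B", "Hello."),
]
print(format_with_speaker_labels(sample))
```

The resulting string is what would be passed as input_text.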

Submit multiple transcripts

LeMUR can easily ingest multiple transcripts in a single API call.

You can submit up to 100 files or 100 hours of audio, whichever is lower.

transcript_group = transcriber.transcribe_group(
    [
        "https://example.org/customer1.mp3",
        "https://example.org/customer2.mp3",
        "https://example.org/customer3.mp3",
    ],
)

# Or use existing transcripts:
# transcript_group = aai.TranscriptGroup.get_by_ids([id1, id2, id3])

result = transcript_group.lemur.task(
    prompt="Provide a summary of these customer calls."
)
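If you have more transcripts than fit in one request, one option is to split them into batches that each stay within the 100-file and 100-hour limits. The sketch below shows a simple greedy split. batch_transcripts is a hypothetical helper, not an SDK function, and it assumes you already know each transcript's duration in seconds.

```python
MAX_FILES = 100
MAX_SECONDS = 100 * 3600  # 100 hours, expressed in seconds


def batch_transcripts(items, max_files=MAX_FILES, max_seconds=MAX_SECONDS):
    """Greedily split (transcript_id, duration_seconds) pairs into batches.

    Each batch holds at most max_files items and at most max_seconds of
    audio. A single item longer than max_seconds is rejected.
    """
    batches, current, current_seconds = [], [], 0.0
    for transcript_id, seconds in items:
        if seconds > max_seconds:
            raise ValueError(f"{transcript_id} exceeds the per-request limit")
        over_files = len(current) >= max_files
        over_time = current_seconds + seconds > max_seconds
        if current and (over_files or over_time):
            # Close the current batch and start a new one.
            batches.append(current)
            current, current_seconds = [], 0.0
        current.append(transcript_id)
        current_seconds += seconds
    if current:
        batches.append(current)
    return batches


# Illustrative input: 250 hypothetical transcript IDs of 30 minutes each.
items = [(f"id-{i}", 1800) for i in range(250)]
print([len(batch) for batch in batch_transcripts(items)])
```

Each batch of IDs could then be loaded with aai.TranscriptGroup.get_by_ids(batch) and processed in its own LeMUR request.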

Delete LeMUR request data

You can delete the data for a previously submitted LeMUR request.

Response data from the LLM, as well as any context provided in the original request, will be removed.

result = transcript.lemur.task(prompt)

deletion_response = aai.Lemur.purge_request_data(result.request_id)