Semantic Kernel Integration for AssemblyAI

Semantic Kernel is an SDK, available for multiple programming languages, for building applications with Large Language Models (LLMs). However, LLMs operate only on text and cannot interpret what is said in audio files. With the AssemblyAI integration for Semantic Kernel, you can use the TranscriptPlugin to transcribe your audio and video files with AssemblyAI’s transcription models.

Quickstart

Add the AssemblyAI.SemanticKernel NuGet package to your project.

dotnet add package AssemblyAI.SemanticKernel

Next, register the TranscriptPlugin in your kernel:

using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;

// Build your kernel (CreateBuilder() returns a builder, so call Build() to get the Kernel)
var kernel = Kernel.CreateBuilder().Build();

// Get AssemblyAI API key from env variables, or much better, from .NET configuration
string apiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")
    ?? throw new Exception("ASSEMBLYAI_API_KEY env variable not configured.");

// Register the plugin; by default the plugin name is the type name, TranscriptPlugin
kernel.ImportPluginFromObject(
    new TranscriptPlugin(apiKey: apiKey)
);
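
The registration code mentions .NET configuration as the preferred source for the API key. A minimal sketch of that approach follows, assuming the Microsoft.Extensions.Configuration.Json and Microsoft.Extensions.Configuration.EnvironmentVariables packages are installed. The appsettings.json file, the AssemblyAI:ApiKey entry name, and the GetApiKey helper are illustrative assumptions, not names defined by the integration.

```csharp
using System;
using Microsoft.Extensions.Configuration;

// Layer configuration sources; later sources override earlier ones.
IConfiguration config = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true) // assumed settings file, may be absent
    .AddEnvironmentVariables()                        // AssemblyAI__ApiKey maps to AssemblyAI:ApiKey
    .Build();

string apiKey = GetApiKey(config);
Console.WriteLine($"Loaded an API key of length {apiKey.Length}.");

// Hypothetical helper: reads the key from the assumed "AssemblyAI:ApiKey" entry
// and fails early with a clear message when it is missing.
static string GetApiKey(IConfiguration configuration) =>
    configuration["AssemblyAI:ApiKey"]
    ?? throw new InvalidOperationException("AssemblyAI:ApiKey is not configured.");
```

The resulting apiKey can be passed to new TranscriptPlugin(apiKey: apiKey) exactly as in the registration code. Reading through IConfiguration lets the same code pick up the key from a settings file during development and from environment variables in deployment.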

Usage

Invoke the Transcribe function of the TranscriptPlugin, passing the URL of your audio or video file as the INPUT argument.

var result = await kernel.InvokeAsync(
    nameof(TranscriptPlugin),
    TranscriptPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        ["INPUT"] = "https://assembly.ai/espn.m4a"
    }
);
Console.WriteLine(result.GetValue<string>());

You can get the transcript using result.GetValue<string>().

You can also upload local audio and video files. To do this:

  • Set the TranscriptPlugin.AllowFileSystemAccess property to true.
  • Configure the INPUT variable with a local file path.

kernel.ImportPluginFromObject(
    new TranscriptPlugin(apiKey: apiKey)
    {
        // Allow the plugin to read files from the local file system
        AllowFileSystemAccess = true
    }
);
var result = await kernel.InvokeAsync(
    nameof(TranscriptPlugin),
    TranscriptPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        // Example local path; replace it with the path to your own file
        ["INPUT"] = "./espn.m4a"
    }
);
Console.WriteLine(result.GetValue<string>());
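
When working with local files, it can help to check the path before invoking the plugin, so a typo surfaces as a clear error instead of a failed transcription request. The sketch below uses only the .NET base class library. ResolveLocalInput is a hypothetical helper and ./espn.m4a is an example path; neither is part of the integration.

```csharp
using System;
using System.IO;

try
{
    // Example path; replace it with your own audio or video file.
    string input = ResolveLocalInput("./espn.m4a");
    Console.WriteLine($"Ready to transcribe: {input}");
}
catch (FileNotFoundException ex)
{
    Console.Error.WriteLine($"{ex.Message} ({ex.FileName})");
}

// Hypothetical helper: converts a possibly relative path to an absolute one
// and throws if no file exists at that location.
static string ResolveLocalInput(string path)
{
    string fullPath = Path.GetFullPath(path);
    if (!File.Exists(fullPath))
    {
        throw new FileNotFoundException("Audio or video file not found.", fullPath);
    }
    return fullPath;
}
```

The returned path can then be supplied as the INPUT argument, with AllowFileSystemAccess set to true on the plugin.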

You can also invoke the function from within a prompt (a semantic function), like this:

string prompt = """
    Here is a transcript:
    {{TranscriptPlugin.Transcribe "https://assembly.ai/espn.m4a"}}
    ---
    Summarize the transcript.
    """;
var result = await kernel.InvokePromptAsync(prompt);
Console.WriteLine(result.GetValue<string>());
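
Invoking a prompt requires a kernel that has an LLM service registered, in addition to the TranscriptPlugin. The following is a sketch of one possible setup, assuming the OpenAI connector that ships with the Microsoft.SemanticKernel package. The OPENAI_API_KEY variable, the gpt-4o-mini model ID, and the audioUrl argument name are assumptions chosen for illustration. The sketch also passes the file URL as a template variable instead of hard-coding it in the prompt; it assumes the variable is bound to the function's input in the same way as the quoted literal.

```csharp
using System;
using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;

string assemblyAIKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")
    ?? throw new Exception("ASSEMBLYAI_API_KEY env variable not configured.");
// Assumed variable name for the LLM provider key.
string openAIKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new Exception("OPENAI_API_KEY env variable not configured.");

// Register a chat completion service so the kernel can run prompts.
// The model ID is an example; use any chat model you have access to.
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(modelId: "gpt-4o-mini", apiKey: openAIKey)
    .Build();
kernel.ImportPluginFromObject(new TranscriptPlugin(apiKey: assemblyAIKey));

// $audioUrl is filled in from the KernelArguments passed below.
const string prompt = """
    Here is a transcript:
    {{TranscriptPlugin.Transcribe $audioUrl}}
    ---
    Summarize the transcript.
    """;

var result = await kernel.InvokePromptAsync(
    prompt,
    new KernelArguments { ["audioUrl"] = "https://assembly.ai/espn.m4a" });
Console.WriteLine(result.GetValue<string>());
```

Keeping the URL in KernelArguments lets the same prompt be reused for different files without editing the template text.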

Additional resources

You can learn more about using Semantic Kernel with AssemblyAI in these resources:
