Semantic Kernel Integration for AssemblyAI

Semantic Kernel is an SDK, available for multiple programming languages, for developing applications with Large Language Models (LLMs). However, LLMs only operate on textual data and don’t understand what is said in audio files. With the AssemblyAI integration for Semantic Kernel, you can use the TranscriptPlugin to transcribe your audio and video files with AssemblyAI’s transcription models.

Quickstart

Add the AssemblyAI.SemanticKernel NuGet package to your project.

dotnet CLI

$ dotnet add package AssemblyAI.SemanticKernel

Next, register the TranscriptPlugin into your kernel:

using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;

// Build your kernel (Build() turns the builder into a Kernel instance)
var kernel = Kernel.CreateBuilder().Build();

// Get AssemblyAI API key from env variables, or much better, from .NET configuration
string apiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")
  ?? throw new Exception("ASSEMBLYAI_API_KEY env variable not configured.");

// Register the plugin; its name defaults to the type name, TranscriptPlugin
kernel.ImportPluginFromObject(
    new TranscriptPlugin(apiKey: apiKey)
);
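
The comment in the snippet above recommends .NET configuration over a raw environment variable. A minimal sketch of that approach follows. It assumes the Microsoft.Extensions.Configuration.UserSecrets and Microsoft.Extensions.Configuration.EnvironmentVariables packages are installed, and the key name AssemblyAI:ApiKey is a hypothetical choice, not something the plugin requires.

```csharp
using System;
using AssemblyAI.SemanticKernel;
using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;

// Assumption: the key is stored under "AssemblyAI:ApiKey" (hypothetical name),
// for example with `dotnet user-secrets set "AssemblyAI:ApiKey" "<your key>"`
// or through the AssemblyAI__ApiKey environment variable.
IConfiguration config = new ConfigurationBuilder()
    .AddUserSecrets<Program>(optional: true) // local development secrets
    .AddEnvironmentVariables()               // later providers override earlier ones
    .Build();

string apiKey = config["AssemblyAI:ApiKey"]
    ?? throw new InvalidOperationException("AssemblyAI:ApiKey is not configured.");

var kernel = Kernel.CreateBuilder().Build();
kernel.ImportPluginFromObject(new TranscriptPlugin(apiKey: apiKey));
```

Keeping the key in user secrets during development avoids committing it to source control, while the environment variable provider still works in deployed environments.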

Usage

Invoke the Transcribe function of the TranscriptPlugin and pass the URL of your audio or video file as the INPUT argument.

var result = await kernel.InvokeAsync(
    nameof(TranscriptPlugin),
    TranscriptPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        ["INPUT"] = "https://assembly.ai/espn.m4a"
    }
);
Console.WriteLine(result.GetValue<string>());

You can get the transcript using result.GetValue<string>().

You can also upload local audio and video files. To do this:

Set the TranscriptPlugin.AllowFileSystemAccess property to true.
Configure the INPUT variable with a local file path.

kernel.ImportPluginFromObject(
    new TranscriptPlugin(apiKey: apiKey)
    {
        // Allow the plugin to read files from the local file system
        AllowFileSystemAccess = true
    }
);
var result = await kernel.InvokeAsync(
    nameof(TranscriptPlugin),
    TranscriptPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        // Local file path (example path; replace with your own file)
        ["INPUT"] = "./espn.m4a"
    }
);
Console.WriteLine(result.GetValue<string>());
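
Because INPUT accepts either a URL or a local file path, it can help to check which one you have before deciding whether file system access is needed. The sketch below uses only the .NET base class library. IsRemoteUrl is a hypothetical helper written for illustration and is not part of AssemblyAI.SemanticKernel.

```csharp
using System;

// Hypothetical helper (not part of AssemblyAI.SemanticKernel): returns true when
// the value is an absolute http(s) URL, false when it should be treated as a
// local file path. The scheme check matters because absolute file paths can
// also parse as absolute URIs with the "file" scheme.
static bool IsRemoteUrl(string input) =>
    Uri.TryCreate(input, UriKind.Absolute, out var uri)
    && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);

string input = "./espn.m4a"; // example path; replace with your own file
bool needsFileSystemAccess = !IsRemoteUrl(input);

Console.WriteLine(needsFileSystemAccess
    ? "Local path: set AllowFileSystemAccess = true."
    : "Remote URL: no file system access needed.");
```

Leaving AllowFileSystemAccess at its default unless a local path is actually used keeps the plugin from reading arbitrary files, which matters when the INPUT value can come from a prompt.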

You can also call the function from within a prompt template (a semantic function), like this:

string prompt = """
                Here is a transcript:
                {{TranscriptPlugin.Transcribe "https://assembly.ai/espn.m4a"}}
                ---
                Summarize the transcript.
                """;
var result = await kernel.InvokePromptAsync(prompt);
Console.WriteLine(result.GetValue<string>());
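
InvokePromptAsync renders the template and then sends the result to an LLM, so the kernel also needs a chat completion service. A minimal sketch of such a setup follows, assuming OpenAI as the provider. The OPENAI_API_KEY variable name and the gpt-4o-mini model ID are illustrative assumptions, and any chat completion connector supported by Semantic Kernel could take their place.

```csharp
using System;
using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;

string assemblyAiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")
    ?? throw new InvalidOperationException("ASSEMBLYAI_API_KEY env variable not configured.");

// Assumption: the OpenAI key lives in OPENAI_API_KEY (illustrative name)
string openAiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new InvalidOperationException("OPENAI_API_KEY env variable not configured.");

// Register an LLM before building the kernel; the model ID is an example
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(modelId: "gpt-4o-mini", apiKey: openAiKey);
var kernel = builder.Build();

// The plugin name defaults to the type name, which is what the
// {{TranscriptPlugin.Transcribe ...}} call in the template refers to
kernel.ImportPluginFromObject(new TranscriptPlugin(apiKey: assemblyAiKey));
```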

Additional resources

You can learn more about using Semantic Kernel with AssemblyAI in these resources: