Transcribe streaming audio from a microphone in Go
Learn how to transcribe streaming audio in Go.
Overview
By the end of this tutorial, you’ll be able to transcribe audio from your microphone in Go.
Supported languages
Streaming Speech-to-Text is only available for English.
Before you begin
To complete this tutorial, you need:
- Go installed.
- An AssemblyAI account with a credit card set up.
You can download the full sample code from GitHub.
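Step 1: Install dependencies
Install the AssemblyAI Go SDK and the gordonklaus/portaudio module, both of which the later steps use.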
Step 2: Create a transcriber
In this step, you’ll define a transcriber to handle real-time events.
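The callback pattern behind a transcriber can be sketched with the standard library alone. The `transcriber` struct, its field names, the `event` type, and `dispatch` below are hypothetical stand-ins for illustration, not the SDK's types, so check the SDK reference for the real handler names.

```go
package main

import (
	"errors"
	"fmt"
)

// errConnectionLost is a sample error used only by this sketch.
var errConnectionLost = errors.New("connection lost")

// transcriber is a hypothetical stand-in for an event-handler struct:
// each field is an optional callback invoked for one kind of event.
type transcriber struct {
	OnPartial func(text string)
	OnFinal   func(text string)
	OnError   func(err error)
}

// event is a hypothetical message received from the streaming service.
type event struct {
	Final bool
	Text  string
	Err   error
}

// dispatch routes one event to the matching callback and skips nil handlers.
func (tr *transcriber) dispatch(e event) {
	switch {
	case e.Err != nil:
		if tr.OnError != nil {
			tr.OnError(e.Err)
		}
	case e.Final:
		if tr.OnFinal != nil {
			tr.OnFinal(e.Text)
		}
	default:
		if tr.OnPartial != nil {
			tr.OnPartial(e.Text)
		}
	}
}

func main() {
	tr := &transcriber{
		OnPartial: func(text string) { fmt.Println("partial:", text) },
		OnFinal:   func(text string) { fmt.Println("final:", text) },
		OnError:   func(err error) { fmt.Println("error:", err) },
	}
	tr.dispatch(event{Text: "hello wor"})
	tr.dispatch(event{Final: true, Text: "Hello, world."})
	tr.dispatch(event{Err: errConnectionLost})
}
```

Keeping each handler optional lets you subscribe only to the events you care about, for example final transcripts and errors.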
Browse to Account, and then click Copy API key under Copy your API key.
Create a new RealTimeClient using the function you created. Replace YOUR_API_KEY with your copied API key.
Sample rate
Sample rate is the number of audio samples per second, measured in hertz (Hz). Higher sample rates result in higher quality audio, which may lead to better transcripts, but also more data being sent over the network. By default, the SDK uses a sample rate of 16 kHz. You can set your own sample rate using the WithSampleRate option.
We recommend the following sample rates:
- Minimum quality: 8_000 (8 kHz)
- Medium quality: 16_000 (16 kHz)
- Maximum quality: 48_000 (48 kHz)
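To get a feel for the network trade-off, here is a minimal sketch that computes the raw data rate at each recommended sample rate. It assumes uncompressed, single-channel, 16-bit PCM (2 bytes per sample). `bytesPerSecond` is a helper made up for this example, not part of the SDK.

```go
package main

import "fmt"

// bytesPerSecond returns the raw data rate of single-channel,
// 16-bit PCM audio (2 bytes per sample) at the given sample rate.
func bytesPerSecond(sampleRate int) int {
	const bytesPerSample = 2 // 16-bit samples
	return sampleRate * bytesPerSample
}

func main() {
	for _, rate := range []int{8_000, 16_000, 48_000} {
		fmt.Printf("%d Hz -> %d bytes/s\n", rate, bytesPerSecond(rate))
	}
	// Prints:
	// 8000 Hz -> 16000 bytes/s
	// 16000 Hz -> 32000 bytes/s
	// 48000 Hz -> 96000 bytes/s
}
```

Under these assumptions, moving from 16 kHz to 48 kHz triples the amount of audio data you send.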
Step 3: Connect the transcriber
To stream audio to AssemblyAI, you first need to establish a connection to the API using client.Connect().
You’ve set up the transcriber to handle real-time events, and connected it to the API. Next, you’ll create a recorder to record audio from your microphone.
Step 4: Record audio from microphone
In this step, you’ll configure your Go app to record audio from your microphone. You’ll use the gordonklaus/portaudio module to make this easier.
In main.go, open a microphone stream. The sampleRate must be the same value as the one you passed to RealTimeClient (16_000 by default).
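One way to keep the two values in sync is to define the sample rate once and derive the recorder's buffer size from it. The sketch below assumes a 200 ms chunk, which is an illustrative choice rather than a documented requirement. `framesPerBuffer` and `chunkDuration` are names made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// sampleRate is defined once and shared by the recorder and the
// transcriber so the two values cannot drift apart.
const sampleRate = 16_000

// chunkDuration is an assumed chunk length used only for illustration.
const chunkDuration = 200 * time.Millisecond

// framesPerBuffer returns how many samples fit in one chunk of the
// given duration at the given sample rate.
func framesPerBuffer(rate int, chunk time.Duration) int {
	return int(int64(rate) * int64(chunk) / int64(time.Second))
}

func main() {
	fmt.Println(framesPerBuffer(sampleRate, chunkDuration)) // prints 3200
}
```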
Audio data format
The recorder formats the audio data for you. If you want to stream data from elsewhere, make sure that your audio data is in the following format:
- Single channel
- 16-bit signed integer PCM or mu-law encoding
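If your audio arrives as a slice of 16-bit samples, you can serialize it to bytes before sending. This sketch assumes little-endian byte order, which the text above does not state, so confirm it against the API reference. `pcm16ToBytes` is a helper made up for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pcm16ToBytes encodes signed 16-bit samples as little-endian bytes,
// two bytes per sample. Little-endian order is an assumption here.
func pcm16ToBytes(samples []int16) []byte {
	out := make([]byte, 2*len(samples))
	for i, s := range samples {
		binary.LittleEndian.PutUint16(out[2*i:], uint16(s))
	}
	return out
}

func main() {
	fmt.Printf("% x\n", pcm16ToBytes([]int16{0, 1, -1, 256}))
	// Prints: 00 00 01 00 ff ff 00 01
}
```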
Read data from the microphone stream, and send it to AssemblyAI for transcription using client.Send().
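The read-and-send loop can be sketched as follows. The `chunkReader` and `chunkSender` interfaces are hypothetical stand-ins for the microphone stream and the client, with method signatures chosen for this example rather than taken from the SDK. The fake implementations let the sketch run without a microphone or network connection.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
)

// chunkReader is a hypothetical stand-in for the microphone stream.
type chunkReader interface {
	Read() ([]byte, error)
}

// chunkSender is a hypothetical stand-in for the real-time client.
type chunkSender interface {
	Send(data []byte) error
}

// stream forwards chunks from r to s until the reader reports io.EOF,
// the context is cancelled, or either side fails.
func stream(ctx context.Context, r chunkReader, s chunkSender) error {
	for {
		if err := ctx.Err(); err != nil {
			return err
		}
		chunk, err := r.Read()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return fmt.Errorf("read audio: %w", err)
		}
		if err := s.Send(chunk); err != nil {
			return fmt.Errorf("send audio: %w", err)
		}
	}
}

// fakeReader yields a fixed list of chunks, then io.EOF.
type fakeReader struct{ chunks [][]byte }

func (f *fakeReader) Read() ([]byte, error) {
	if len(f.chunks) == 0 {
		return nil, io.EOF
	}
	c := f.chunks[0]
	f.chunks = f.chunks[1:]
	return c, nil
}

// fakeSender counts the bytes it receives.
type fakeSender struct{ sent int }

func (f *fakeSender) Send(data []byte) error {
	f.sent += len(data)
	return nil
}

// runDemo streams two fake chunks and reports how many bytes were sent.
func runDemo() (int, error) {
	r := &fakeReader{chunks: [][]byte{{1, 2}, {3, 4, 5}}}
	s := &fakeSender{}
	err := stream(context.Background(), r, s)
	return s.sent, err
}

func main() {
	sent, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("bytes sent:", sent) // prints "bytes sent: 5"
}
```

Checking the context at the top of each iteration gives you a single place to stop the loop when the user interrupts the program.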
Step 5: Disconnect the transcriber
In this step, you’ll clean up resources by stopping the recorder and disconnecting the transcriber.
To disconnect the transcriber on Ctrl+C, use client.Disconnect(). Disconnect() accepts a boolean parameter that allows you to wait for any remaining transcriptions before closing the connection.
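A minimal sketch of the Ctrl+C handling, using the standard os/signal package. The `disconnecter` interface and `fakeClient` are hypothetical stand-ins for the real client, and the method signature is an assumption based only on the boolean parameter described above. The demo simulates the interrupt so it exits on its own.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
)

// disconnecter is a hypothetical interface modeling a client whose
// Disconnect method takes the boolean described above.
type disconnecter interface {
	Disconnect(waitForFinal bool) error
}

// disconnectOnSignal blocks until a signal arrives on sigs, then asks
// the client to disconnect and wait for any remaining transcripts.
func disconnectOnSignal(sigs <-chan os.Signal, d disconnecter) error {
	<-sigs
	return d.Disconnect(true)
}

// fakeClient records how Disconnect was called so the sketch runs offline.
type fakeClient struct {
	calls  int
	waited bool
}

func (f *fakeClient) Disconnect(waitForFinal bool) error {
	f.calls++
	f.waited = waitForFinal
	return nil
}

// runDemo wires up Ctrl+C handling and simulates one interrupt.
func runDemo() (*fakeClient, error) {
	sigs := make(chan os.Signal, 1) // buffered so a signal is not dropped
	signal.Notify(sigs, os.Interrupt)
	defer signal.Stop(sigs)

	// Simulate Ctrl+C so the demo exits on its own.
	// Delete this line in a real program.
	sigs <- os.Interrupt

	client := &fakeClient{}
	err := disconnectOnSignal(sigs, client)
	return client, err
}

func main() {
	client, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("waited for remaining transcripts:", client.waited)
}
```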
Run your application to start transcribing. Your OS may require you to allow your app to access your microphone. If prompted, click Allow.
You can also find the source code for this tutorial on GitHub.
Need some help?
If you get stuck, or have any other questions, we’d love to help you out. Contact our support team at support@assemblyai.com or create a support ticket.