We recently announced our latest speech model, Universal-1, which sets a new standard for speech-to-text accuracy. Trained on millions of hours of audio data, Universal-1 demonstrates near-human accuracy, even with accented speech, background noise, and difficult phrases like flight numbers and email addresses.
Universal-1 is also an order of magnitude faster than our previous model, Conformer-2, and supports English, Spanish, French, and German, with more languages coming shortly.
Along with Universal-1, we’ve introduced two new pricing tiers: Best and Nano.
- Best lets you take advantage of Universal-1 for applications where accuracy is paramount.
- Nano is our new cost-effective tier with support for 99 different languages.
In this post, you’ll learn how to transcribe an audio file in your Go applications using Universal-1 and Nano.
Set up the AssemblyAI Go SDK
The easiest way to start transcribing audio is by using one of our official SDKs.
To install the AssemblyAI Go SDK, run the following command in the same directory as your Go project:
go get github.com/AssemblyAI/assemblyai-go-sdk
Import the Go module in your project:
import (
	aai "github.com/AssemblyAI/assemblyai-go-sdk"
)
Configure a new authenticated SDK client using your API key from your account dashboard.
apiKey := os.Getenv("ASSEMBLYAI_API_KEY")
client := aai.NewClient(apiKey)
You’ll find all the operations you need on the client instance, such as client.Transcripts.TranscribeFromURL and client.Transcripts.TranscribeFromReader.
Transcribe an audio file using Universal-1
By default, all transcriptions use the Best tier, so you’ll always get the highest accuracy without any extra configuration.
To transcribe an audio file from a URL using the Best tier, create a new file called main.go with the following code:
package main

import (
	"context"
	"fmt"
	"os"

	aai "github.com/AssemblyAI/assemblyai-go-sdk"
)

func main() {
	apiKey := os.Getenv("ASSEMBLYAI_API_KEY")
	client := aai.NewClient(apiKey)

	ctx := context.Background()

	audioURL := "https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3"

	// Empty parameters use the default speech model (the Best tier).
	params := &aai.TranscriptOptionalParams{}

	transcript, err := client.Transcripts.TranscribeFromURL(ctx, audioURL, params)
	if err != nil {
		fmt.Println("Something went wrong while transcribing:", err)
		os.Exit(1)
	}

	fmt.Println(aai.ToString(transcript.Text))
}
If you instead want to transcribe a local file, use TranscribeFromReader:
package main

import (
	"context"
	"fmt"
	"os"

	aai "github.com/AssemblyAI/assemblyai-go-sdk"
)

func main() {
	apiKey := os.Getenv("ASSEMBLYAI_API_KEY")
	client := aai.NewClient(apiKey)

	ctx := context.Background()

	// Open the local audio file to upload and transcribe.
	f, err := os.Open("./audio.mp3")
	if err != nil {
		fmt.Println("Something went wrong while opening the file:", err)
		os.Exit(1)
	}
	defer f.Close()

	params := &aai.TranscriptOptionalParams{}

	transcript, err := client.Transcripts.TranscribeFromReader(ctx, f, params)
	if err != nil {
		fmt.Println("Something went wrong while transcribing:", err)
		os.Exit(1)
	}

	fmt.Println(aai.ToString(transcript.Text))
}
Nano—a cost-effective speech-to-text alternative
Switching between Best and Nano is only a matter of setting SpeechModel in your transcription parameters:
params := &aai.TranscriptOptionalParams{
	SpeechModel: aai.SpeechModelNano,
}
Best or Nano, which one is right for you?
With two speech-to-text options, you might wonder which one you should use for your application.
We recommend using Best for applications where it’s critical to get accurate, high-quality transcripts—for example, when you want to display the transcript to your end user.
If you have a high volume of transcriptions and are looking to reduce costs, or if you need additional language support, try Nano.
We encourage you to compare the results to find which one works best for the application you’re building.
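If your application handles audio in several languages, you could encode this guidance in a small helper. The sketch below is one possible rule of thumb, not an official recommendation. The function chooseTier is hypothetical, the two-letter language codes are assumed, and the returned strings "best" and "nano" are assumed to correspond to the two tiers:

```go
package main

import "fmt"

// universal1Languages holds the languages this post lists for Universal-1
// (English, Spanish, French, German), keyed by assumed two-letter codes.
var universal1Languages = map[string]bool{
	"en": true,
	"es": true,
	"fr": true,
	"de": true,
}

// chooseTier is a hypothetical helper. It picks "nano" when cost matters
// most or when the language is outside the Universal-1 list, and "best"
// otherwise.
func chooseTier(languageCode string, costSensitive bool) string {
	if costSensitive || !universal1Languages[languageCode] {
		return "nano"
	}
	return "best"
}

func main() {
	fmt.Println(chooseTier("en", false)) // best
	fmt.Println(chooseTier("en", true))  // nano
	fmt.Println(chooseTier("ja", false)) // nano
}
```

You would then map the result to aai.SpeechModelNano, or leave SpeechModel unset for Best, when building your transcription parameters.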
To read more about Universal-1, see our research article.