Migration guide: Google Speech-to-Text to AssemblyAI

This guide walks through the process of migrating from Google Speech-to-Text (STT) to AssemblyAI for transcribing pre-recorded audio.

Get Started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.

Side-by-side code comparison

Below is a side-by-side comparison of basic snippets that transcribe a file with Google Speech-to-Text and with AssemblyAI.

Google STT
from google.cloud import speech

client = speech.SpeechClient()

audio = speech.RecognitionAudio(
    uri="gs://cloud-samples-tests/speech/Google_Gnome.wav"
)

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    model="video",  # Chosen model
)

operation = client.long_running_recognize(config=config, audio=audio)

print("Waiting for operation to complete...")
response = operation.result(timeout=90)

for i, result in enumerate(response.results):
    alternative = result.alternatives[0]
    print("-" * 20)
    print(f"First alternative of result {i}")
    print(f"Transcript: {alternative.transcript}")
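On the AssemblyAI side, the same task might look like the minimal sketch below, written against the `assemblyai` Python SDK (`pip install assemblyai`). It assumes the audio is reachable as a public HTTPS URL or a local path rather than a `gs://` URI. The function name and the optional `transcriber` argument are hypothetical; the argument exists only so the function can be exercised without network access.

```python
def transcribe_with_assemblyai(audio_source, api_key, transcriber=None):
    """Transcribe a URL or local file path and return the transcript text.

    `transcriber` is a hypothetical injection point for tests; when omitted,
    a real aai.Transcriber is created.
    """
    if transcriber is None:
        # Imported lazily so this module loads even without the SDK installed.
        import assemblyai as aai

        aai.settings.api_key = api_key
        transcriber = aai.Transcriber()

    # transcribe() blocks until the job has finished, so there is no
    # separate operation.result(timeout=...) step as in the Google snippet.
    transcript = transcriber.transcribe(audio_source)
    return transcript.text
```

Calling `transcribe_with_assemblyai("https://example.com/audio.wav", api_key)` would then return the full transcript as one string, where the URL is a placeholder for your own file.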

Installation

Google STT
from google.cloud import speech

client = speech.SpeechClient()

When migrating from Google Speech-to-Text to AssemblyAI, you’ll first need to handle authentication and SDK setup:

Get your API key from your AssemblyAI dashboard. Things to know:

  • Store your API key securely in an environment variable
  • API key authentication works the same across all AssemblyAI SDKs
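One way to follow the environment-variable advice above is sketched here. The variable name `ASSEMBLYAI_API_KEY` and both helper names are assumptions chosen for illustration; the only SDK call used is setting `aai.settings.api_key`.

```python
import os


def load_api_key(env=os.environ, var_name="ASSEMBLYAI_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is missing."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


def configure_sdk(api_key):
    """Hand the key to the AssemblyAI SDK (pip install assemblyai)."""
    # Imported lazily so load_api_key() can be used without the SDK installed.
    import assemblyai as aai

    aai.settings.api_key = api_key
    return aai
```

A typical start-up sequence would be `aai = configure_sdk(load_api_key())`, which fails fast with a clear message when the key has not been exported.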

Audio File Sources

Google STT
audio = speech.RecognitionAudio(uri="gs://cloud-samples-tests/speech/Google_Gnome.wav")

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    model="video",  # Chosen model
)

operation = client.long_running_recognize(config=config, audio=audio)

Here are helpful things to know when migrating your audio input handling:

  • There’s no need to specify the audio encoding format when using AssemblyAI. A transcoding pipeline runs under the hood on all supported file types, so you get the most accurate transcription without configuring the encoding or sample rate.
  • You can submit a local file, URL, stream, buffer, blob, etc., directly to our transcriber. See the AssemblyAI documentation for common ways to host audio files.
  • You can transcribe audio files that are up to 10 hours long, and you can transcribe multiple files in parallel. The default concurrency limit is 200 jobs on the pay-as-you-go (PAYG) plan.
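Transcribing several files in parallel, as mentioned in the list above, could be sketched with a standard-library thread pool. Both helper names are hypothetical, and the sketch assumes that sharing one `aai.Transcriber` across threads is acceptable; check the SDK documentation before relying on that. Keep `max_workers` well under your account's concurrency limit.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_many(sources, transcribe_one, max_workers=4):
    """Apply transcribe_one to every source concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `sources`.
        return list(pool.map(transcribe_one, sources))


def make_assemblyai_transcribe(api_key):
    """Build a transcribe_one callable backed by the AssemblyAI SDK."""
    # Imported lazily so transcribe_many() works without the SDK installed.
    import assemblyai as aai

    aai.settings.api_key = api_key
    transcriber = aai.Transcriber()

    def transcribe_one(source):
        # `source` may be a local file path or a publicly reachable URL.
        return transcriber.transcribe(source).text

    return transcribe_one
```

For example, `transcribe_many(["a.wav", "b.wav"], make_assemblyai_transcribe(api_key))` would return the two transcript texts in input order; the file names are placeholders.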

Basic Transcription

Google STT
print("Waiting for operation to complete...")
response = operation.result(timeout=90)

for i, result in enumerate(response.results):
    alternative = result.alternatives[0]
    print("-" * 20)
    print(f"First alternative of result {i}")
    print(f"Transcript: {alternative.transcript}")

Here are helpful things to know about our transcribe method:

  • The SDK handles polling under the hood.
  • The full transcript is directly accessible via transcript.text.
  • English is the default language. We recommend specifying speech_models=["universal-3-pro", "universal-2"] for the highest accuracy.
  • We have a cookbook on handling common errors when using our API.
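The points above can be combined into a small sketch that requests the recommended models and checks for a failed job before reading `transcript.text`. The helper names are hypothetical. The `speech_models` value is taken verbatim from this guide, and whether `aai.TranscriptionConfig` accepts it depends on the installed SDK version, so treat that parameter as an assumption.

```python
def text_or_raise(transcript):
    """Return transcript.text, or raise if the job reported an error."""
    if transcript.error:
        raise RuntimeError(f"Transcription failed: {transcript.error}")
    return transcript.text


def transcribe_checked(audio_source, api_key):
    """Transcribe with the recommended models and surface failures as exceptions."""
    # Imported lazily so text_or_raise() can be used without the SDK installed.
    import assemblyai as aai

    aai.settings.api_key = api_key
    config = aai.TranscriptionConfig(
        # Value quoted from this guide; assumed to be supported by your SDK version.
        speech_models=["universal-3-pro", "universal-2"],
    )
    # The SDK polls for completion internally, so this call returns a finished job.
    transcript = aai.Transcriber().transcribe(audio_source, config=config)
    return text_or_raise(transcript)
```

Raising on `transcript.error` keeps a failed job from being mistaken for an empty transcript further down the pipeline.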

Adding Features

Google STT
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=8000,
    language_code="en-US",
    enable_speaker_diarization=True,  # Speaker diarization
    diarization_speaker_count=2,  # Specify the number of speakers
    profanity_filter=True,  # Mask profanity in the transcript
)

Key differences:

  • Use aai.TranscriptionConfig to specify any extra features that you wish to use.
  • The results for Speaker Diarization are stored in transcript.utterances. To see the full transcript response object, refer to our API Reference.
  • Check our documentation for our full list of available features and their parameters.
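A rough AssemblyAI counterpart to the Google configuration in this section is sketched below. It uses `speaker_labels`, `speakers_expected`, and `filter_profanity` on `aai.TranscriptionConfig` and reads the diarization results from `transcript.utterances`. The helper names are hypothetical, and the parameter pairing in the comments is this sketch's reading of the two APIs, not an official mapping.

```python
def format_utterances(utterances):
    """Render diarized utterances as 'Speaker X: text' lines."""
    return [f"Speaker {u.speaker}: {u.text}" for u in utterances]


def transcribe_with_features(audio_source, api_key):
    """Transcribe with speaker labels and profanity filtering enabled."""
    # Imported lazily so format_utterances() can be used without the SDK installed.
    import assemblyai as aai

    aai.settings.api_key = api_key
    config = aai.TranscriptionConfig(
        speaker_labels=True,    # counterpart of enable_speaker_diarization
        speakers_expected=2,    # counterpart of diarization_speaker_count
        filter_profanity=True,  # counterpart of profanity_filter
    )
    transcript = aai.Transcriber().transcribe(audio_source, config=config)
    # utterances may be None when diarization produced no results.
    return format_utterances(transcript.utterances or [])
```

Each returned line pairs a speaker label with that speaker's text, which stands in for walking the per-word speaker tags in the Google response.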