Transcribe from an S3 Bucket

AssemblyAI’s Speech-to-Text APIs can be used with both local files and publicly accessible online files, but what if you want to transcribe an audio file that has restricted access? Luckily, you can do this with AssemblyAI too!

Read on to learn how you can transcribe an audio file stored in an AWS S3 bucket using AssemblyAI’s APIs.

Intro

To transcribe an audio file from an S3 bucket, AssemblyAI needs temporary access to the file. To provide this access, we’ll generate a presigned URL, which is a URL with temporary access rights baked in: anyone holding the URL can read the file until the URL expires.

The overall process looks like this:

  1. Generate a presigned URL for the S3 audio file with boto3.
  2. Pass this URL through to AssemblyAI’s API with a POST request.
  3. Wait until the transcription is complete, and then fetch it with a GET request.

Prerequisites

First, you’ll need an AssemblyAI account. You can sign up here for a free account if you don’t already have one.

Next, you’ll need to take note of your AssemblyAI API key, which you can find on your account dashboard after signing in. It will be on the left-hand side of the screen under Your API Key.

You’ll need the value of this key later, so leave the browser window open or copy the value into a text file.

AWS IAM User

Second, you’ll need an AWS IAM user with Programmatic access and the AmazonS3ReadOnlyAccess permission. If you already have such an IAM user and you know its access key ID and secret access key, then you can move on to the next section. Otherwise, create one now as follows:

First, log in to AWS as a root user or as another IAM user with the appropriate access, and then go to the IAM Management Console to add a new user.

Set the user name you would like, and select Programmatic access under Select AWS access type:

Click Next, and then Attach existing policies directly. Copy and paste AmazonS3ReadOnlyAccess into the Filter policies search box, and then add this permission by clicking on the checkbox next to it:

Click Next and add tags if you wish. Then click Next and review the IAM user profile to ensure that everything looks copacetic before clicking Create user.

Finally, take note of the IAM user’s Access key ID and Secret access key. Again, we will need these values later, so copy them into a text file before moving on.

Warning: Make sure to copy the IAM user’s Secret access key and record it somewhere safe. Once you close the final window of the Add user sequence, you will not be able to view this key again and will need to generate a new one if you lose the original.

Code

First, install the necessary packages. The code below also uses requests, so it is included here.

$ pip install -U boto3 botocore requests

Then we can import them and set the relevant variables. Replace each placeholder with the value for your application:

  1. bucket_name - The name of your AWS S3 bucket.
  2. object_name - The name of the audio file in the S3 bucket that you want to transcribe.
  3. iam_access_id - The access key ID of the IAM user with programmatic access and S3 read permission.
  4. iam_secret_key - The secret key of the IAM user.
  5. assembly_key - Your AssemblyAI API key.

import boto3
from botocore.exceptions import ClientError
import logging
import requests
import time

bucket_name = "<BUCKET_NAME>"
object_name = "<AUDIO_FILE_NAME>"

iam_access_id = "<IAM_ACCESS_ID>"
iam_secret_key = "<IAM_SECRET_KEY>"

assembly_key = "<ASSEMBLYAI_API_KEY>"
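
If you would rather not paste keys into the script, one option is to read them from environment variables. The sketch below is only an illustration: `load_setting` is a hypothetical helper, and the variable name `ASSEMBLYAI_API_KEY` is an assumption rather than something AssemblyAI requires.

```python
import os


def load_setting(name, environ=None):
    """Return the value of an environment variable, failing loudly if it is unset."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise KeyError(f"Environment variable {name} is not set")
    return value


# Example with a stand-in mapping; in a real script, omit `environ` to read os.environ.
# The variable name ASSEMBLYAI_API_KEY is an assumption made for this sketch.
example_env = {"ASSEMBLYAI_API_KEY": "example-key"}
assembly_key_from_env = load_setting("ASSEMBLYAI_API_KEY", environ=example_env)
```

The same helper could be used for the IAM access key ID and secret key, so that no secret ends up in source control.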

From here, we simply follow the sequence outlined in the introduction of this Colab:

  1. Generate a presigned URL for the S3 audio file with boto3.
# Create a low-level service client with the IAM credentials.
s3_client = boto3.client(
    "s3", aws_access_key_id=iam_access_id, aws_secret_access_key=iam_secret_key
)

# Generate a presigned URL for the audio file that expires after 30 minutes.
try:
    p_url = s3_client.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": bucket_name, "Key": object_name},
        ExpiresIn=1800,
    )
except ClientError as e:
    # Log the failure and stop, since the later steps need a valid URL.
    logging.error(e)
    raise
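
Since the presigned URL is only valid for a limited time, it can be handy to check when it expires. The following is a minimal sketch that reads the expiry from the URL's query string using only the standard library. The helper name `presigned_url_expiry` and the sample URL are made up for illustration; the sketch assumes the URL uses either the Signature Version 4 parameters (`X-Amz-Date`, `X-Amz-Expires`) or the older Signature Version 2 parameter (`Expires`).

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def presigned_url_expiry(url):
    """Return the UTC expiry time encoded in an S3 presigned URL, or None if absent."""
    query = parse_qs(urlparse(url).query)
    # Signature Version 4 URLs carry a signing time plus a lifetime in seconds.
    if "X-Amz-Date" in query and "X-Amz-Expires" in query:
        signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
        signed_at = signed_at.replace(tzinfo=timezone.utc)
        return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
    # Older Signature Version 2 URLs carry an absolute Unix timestamp instead.
    if "Expires" in query:
        return datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)
    return None


# Made-up URL for illustration only; the bucket, key, and signature are fake.
sample_url = (
    "https://example-bucket.s3.amazonaws.com/audio.mp3"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20240101T120000Z"
    "&X-Amz-Expires=1800"
    "&X-Amz-Signature=fake"
)
print(presigned_url_expiry(sample_url))
```

In the script above, you would pass `p_url` instead of the sample URL. If the audio file is long or the queue is busy, a longer `ExpiresIn` value may be worth considering.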
  2. Pass the presigned URL through to AssemblyAI’s API with a POST request.
# Use your AssemblyAI API Key for authorization.
headers = {"authorization": assembly_key, "content-type": "application/json"}

# Specify AssemblyAI's transcription API endpoint.
transcript_endpoint = "https://api.assemblyai.com/v2/transcript"

# Use the presigned URL as the `audio_url` in the POST request.
payload = {"audio_url": p_url}

# Queue the audio file for transcription with a POST request.
post_response = requests.post(transcript_endpoint, json=payload, headers=headers)

# Stop early if the request was rejected (for example, because of a bad API key).
post_response.raise_for_status()
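
Before moving on, it can help to confirm that the POST request actually queued a transcript. The sketch below shows one possible check. `extract_transcript_id` is a hypothetical helper, and it assumes that a rejected request returns a JSON body with a message under the `error` key.

```python
def extract_transcript_id(status_code, body):
    """Return the transcript ID from a POST response, or raise with the reported error."""
    if status_code != 200 or "id" not in body:
        # Assumption: failed requests carry a human-readable message under "error".
        message = body.get("error", "no error message in response")
        raise RuntimeError(f"Transcription request failed ({status_code}): {message}")
    return body["id"]


# Stand-in values for illustration; in the script above these would be
# post_response.status_code and post_response.json().
print(extract_transcript_id(200, {"id": "abc123", "status": "queued"}))
```

Failing here, with the API's own message, is usually easier to debug than a `KeyError` on `"id"` in the next step.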
  3. Wait until the transcription is complete, and then fetch it with a GET request.
# Build the endpoint for this transcript from the ID returned by the POST request.
get_endpoint = transcript_endpoint + "/" + post_response.json()["id"]

# Poll the transcript until it has finished processing.
while True:
    get_response = requests.get(get_endpoint, headers=headers)
    result = get_response.json()
    if result["status"] == "completed":
        break
    # Stop polling if the transcription failed; otherwise this loop would never end.
    if result["status"] == "error":
        raise RuntimeError("Transcription failed: " + str(result.get("error")))
    time.sleep(5)

# Once the transcription is complete, print it out.
print(result["text"])
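
For longer-running jobs, you may also want the polling step to give up after a while instead of waiting indefinitely. Here is a rough sketch of a polling helper with a timeout. `wait_for_transcript` and its parameters are hypothetical names chosen for this example, and the fetcher is a stand-in that replays canned responses so the snippet runs without network access.

```python
import time


def wait_for_transcript(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status() until the transcript completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(f"Transcription failed: {result.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Transcription did not finish before the timeout")
        sleep(interval)


# Stand-in fetcher that replays canned responses; a real one might be
# lambda: requests.get(get_endpoint, headers=headers).json()
canned = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "text": "hello world"},
])
result = wait_for_transcript(lambda: next(canned), sleep=lambda seconds: None)
print(result["text"])
```

Passing the fetcher and the sleep function in as arguments keeps the helper easy to test, at the cost of a slightly longer call site.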