Transcribing an audio file

In this guide, we’ll show you how to use the API to transcribe your audio files.

This content is also covered in the video How to Transcribe Audio Files with Python on AssemblyAI’s YouTube channel.

If you’re using Python or TypeScript, see Transcribe an audio file.

Get started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.

The entire source code of this guide can be viewed here.

Step-by-step instructions

1

Install the SDK.

$ pip install -U assemblyai

2

Import the assemblyai package and set the API key.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

3

Create a Transcriber object.

transcriber = aai.Transcriber()

4

Call the Transcriber object’s transcribe method and pass in the audio file’s path as a parameter. The transcribe method returns the results of the transcription, which the example below stores in a variable named transcript.

transcript = transcriber.transcribe("./my-audio.mp3")

5

Alternatively, you can pass in the URL of an audio file hosted on the internet.

transcript = transcriber.transcribe("https://example.org/audio.mp3")

6

You can access the transcription text through the transcript object’s text attribute.

print(transcript.text)
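The steps above can be wrapped in a small helper that also checks for a failed transcription before reading the text. This is a minimal sketch, not part of the original guide: the names transcribe_text and FakeTranscriber are hypothetical, and it assumes a failed transcript exposes a non-empty error attribute. The stand-in transcriber lets the sketch run without an API key; in real use you would pass in the aai.Transcriber() object from step 3.

```python
from types import SimpleNamespace


def transcribe_text(transcriber, source):
    """Transcribe `source` (a local path or a URL) and return the text.

    `transcriber` is any object with a `transcribe(source)` method, such as
    the `aai.Transcriber()` object created in step 3.
    """
    transcript = transcriber.transcribe(source)
    # Assumption: a failed transcript carries a non-empty `error` message.
    error = getattr(transcript, "error", None)
    if error:
        raise RuntimeError(f"Transcription failed: {error}")
    return transcript.text


class FakeTranscriber:
    """Hypothetical stand-in so the sketch runs without network access."""

    def transcribe(self, source):
        return SimpleNamespace(text=f"(transcript of {source})", error=None)


print(transcribe_text(FakeTranscriber(), "./my-audio.mp3"))
```

Passing the transcriber in as an argument keeps the helper easy to test, because the real client can be swapped for a fake one.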

Understanding the response

The AssemblyAI API returns JSON-formatted output. The transcription text is in the text key. The words key holds a timestamp and a confidence score for each word, and the response also includes other parameters assigned by the API, such as language_code and language_model.

Refer to the API reference for a breakdown of every element in your transcript output.
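To make the shape of the output concrete, here is a rough sketch that reads the text and words keys from a hand-written sample. The sample is illustrative, not real API output. The per-word key names (text, start, end, confidence), the millisecond unit, and the helper low_confidence_words are assumptions made for this example; check them against the API reference.

```python
import json

# Hand-written sample, NOT real API output. Only the keys discussed above are
# shown; the per-word key names are assumptions for illustration.
sample_response = json.loads("""
{
  "text": "Hello world",
  "language_code": "en_us",
  "words": [
    {"text": "Hello", "start": 250, "end": 650, "confidence": 0.97},
    {"text": "world", "start": 730, "end": 1022, "confidence": 0.89}
  ]
}
""")


def low_confidence_words(response, threshold=0.9):
    """Return the words whose confidence score falls below `threshold`."""
    return [w["text"] for w in response["words"] if w["confidence"] < threshold]


print(sample_response["text"])
for word in sample_response["words"]:
    # Assumption: `start` is an offset from the beginning of the audio, in ms.
    print(f'{word["start"]:>6} ms  {word["text"]}  (confidence {word["confidence"]:.2f})')
print(low_confidence_words(sample_response))
```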

Best practices

When using the AssemblyAI API to transcribe audio files, we recommend polling to check the status of the transcription. This means making a request every few seconds to check whether the transcription is complete.
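A polling loop along these lines could look like the sketch below. It is deliberately independent of any HTTP client: fetch_transcript is a hypothetical callable that returns the transcript as a dict, and the status values ("queued", "processing", "completed", "error") and the error key are assumptions to confirm against the API reference. The demo feeds it canned responses so that it runs without network access.

```python
import time


def wait_for_transcript(fetch_transcript, interval_seconds=3.0,
                        timeout_seconds=600.0, sleep=time.sleep):
    """Poll until the transcript is finished, then return it.

    `fetch_transcript` is a hypothetical zero-argument callable that returns
    the transcript as a dict with at least a "status" key.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        transcript = fetch_transcript()
        status = transcript["status"]
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Transcript was not ready before the timeout")
        # Still queued or processing: wait before asking again.
        sleep(interval_seconds)


# Canned responses standing in for successive status requests.
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "text": "hello world"},
])
result = wait_for_transcript(lambda: next(responses), sleep=lambda seconds: None)
print(result["text"])
```

The timeout keeps a stuck job from blocking the caller forever, and the injectable sleep makes the loop testable without real delays.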

Alternatively, you can also set up webhooks to receive notifications when the transcription is complete. This can help reduce the overhead of polling and make your application more efficient.
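On the receiving side, a webhook handler mostly needs to parse the notification body and decide whether the transcript is ready to fetch. The sketch below assumes the notification is a JSON object with transcript_id and status keys. That payload shape and the function name parse_webhook_notification are assumptions for illustration, so verify the actual format in the webhook documentation before relying on it.

```python
import json


def parse_webhook_notification(body: bytes):
    """Return the transcript ID if the notification reports completion, else None.

    Assumption: the body is a JSON object with "transcript_id" and "status" keys.
    """
    payload = json.loads(body)
    if payload.get("status") == "completed":
        return payload.get("transcript_id")
    return None


# Hand-written example body, not captured from a real notification.
example_body = b'{"transcript_id": "abc123", "status": "completed"}'
print(parse_webhook_notification(example_body))
```

A function like this can be called from whichever web framework serves your webhook endpoint. The returned ID is then used to retrieve the finished transcript.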

Conclusion

Transcription is our core API use case, and nearly all other AssemblyAI features build on our transcription functionality. We’re constantly improving and updating the language models used by our transcription engine. As a rule, higher-quality audio produces better results.

We’d love to hear about any new integrations or solutions that you build using our transcription API. You can find us on Twitter or apply to join our Creators Program. You can also try the AssemblyAI Playground to experiment with our transcription features without writing any code. If you encounter any issues or have any questions, see the FAQ or reach out to our Support team.
