Transcribe Multiple Files Simultaneously Using the Python SDK
In this guide, we’ll show you how to use the AssemblyAI API to transcribe multiple audio files at once, using the AssemblyAI Python SDK.
For an alternative approach that uses webhooks and a server API, see here.
Quickstart
Get Started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.
Step-by-Step Guide
Install the SDK.
Import the assemblyai package and set the API key. Also import threading and os, the Python libraries that enable concurrent task processing and file path handling, respectively.
Set the folders. The batch folder contains the audio files that you want to process and transcribe. The transcription_result_folder stores the .txt transcript files.
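The folder setup can be sketched as follows. The guide names the batch folder and the transcription_result_folder variable; the variable name batch_folder, the folder name "transcribed", and the os.makedirs call are assumptions added for illustration.

```python
import os

# Folder holding the audio files to transcribe.
batch_folder = "batch"

# Folder that receives one .txt transcript per audio file.
# The folder name "transcribed" is an assumption for illustration.
transcription_result_folder = "transcribed"

# Create the results folder if it does not exist yet, so later writes do not fail.
os.makedirs(transcription_result_folder, exist_ok=True)
```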
Create a Transcriber object.
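A minimal sketch of the import, API key, and Transcriber steps, wrapped in a helper. The helper name make_transcriber and the environment variable ASSEMBLYAI_API_KEY are assumptions, not part of the SDK; you could equally assign the key as a string literal.

```python
import os


def make_transcriber():
    """Configure the SDK and return a Transcriber (requires `pip install assemblyai`)."""
    # Imported inside the function so this sketch loads even before the SDK is installed.
    import assemblyai as aai

    # Assumption: the API key is stored in an environment variable with this name.
    aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]
    return aai.Transcriber()
```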
Define a function to transcribe an audio file. Once the transcript is complete, a .txt file is written to the transcription_result_folder. If the transcription fails with an error, no file is written to the results folder.
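One way such a function might look is sketched below. It takes the Transcriber as a parameter rather than using a global, which is a design choice made here for illustration; the function name, its parameters, and the output naming scheme (audio file name with a .txt extension) are assumptions.

```python
import os


def transcribe_audio(transcriber, audio_path, result_folder):
    """Transcribe one file and write <name>.txt; return the output path, or None on error."""
    transcript = transcriber.transcribe(audio_path)

    # A failed transcript carries an error message; skip writing in that case.
    if transcript.error:
        print(f"Transcription failed for {audio_path}: {transcript.error}")
        return None

    # Name the output after the audio file, e.g. batch/call.mp3 -> call.txt.
    base_name = os.path.splitext(os.path.basename(audio_path))[0]
    output_path = os.path.join(result_folder, base_name + ".txt")
    os.makedirs(result_folder, exist_ok=True)
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(transcript.text or "")
    return output_path
```

Passing the Transcriber in also makes the function easy to try out with a stand-in object before spending API credits.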
Start one thread per file to transcribe the files concurrently. Once all the threads have finished, the “All transcriptions are complete” message is printed in your terminal.
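The threading step might be sketched as below. The helper name run_concurrently and its parameters are hypothetical; worker stands for the transcription function from the previous step and items for the file names in the batch folder (for example, from os.listdir).

```python
import threading


def run_concurrently(worker, items):
    """Start one thread per item, wait for all of them, and report completion."""
    threads = []
    for item in items:
        thread = threading.Thread(target=worker, args=(item,))
        threads.append(thread)
        thread.start()

    # Block until every thread has finished.
    for thread in threads:
        thread.join()

    print("All transcriptions are complete.")
```

This starts one thread per file, which suits small batches. For large folders, a bounded pool such as concurrent.futures.ThreadPoolExecutor may be a better fit.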
Conclusion
This guide demonstrated how to use the AssemblyAI Python SDK to process multiple audio files concurrently. The output is one transcript text file for each audio file in the specified folder.
Other integrations and features can be built on top of this main function. These include, but are not limited to, exporting the transcripts in different formats and adding Core Transcription or Audio Intelligence features.
If you have any questions, please feel free to reach out to our Support team at support@assemblyai.com or in our Community Discord!