SRT is a widely used subtitle file format for videos. In this guide, we'll show you how to create SRT (.srt) files for videos in Python.
What is an SRT file?
An SRT file, or SubRip file, uses one of the most common subtitle formats for videos and is generally saved with the .srt extension. The format contains human-readable plain text that provides the timing information for each subtitle along with the subtitle text itself.
Here's a breakdown of how the format works:
- Each subtitle entry consists of an index number, start time, end time, and text.
- The index number is a sequential number starting from 1.
- The start and end times are given in the format hours:minutes:seconds,milliseconds and are separated by -->.
- The text that follows the timing information is the subtitle text itself, and it may span multiple lines.
- Entries are separated by a blank line.
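The rules above can be sketched in a few lines of Python. This is a minimal illustration and not part of the AssemblyAI SDK: the helper names format_timestamp, format_entry, and build_srt are hypothetical, and the cue times are assumed to be given in milliseconds.

```python
def format_timestamp(ms: int) -> str:
    """Convert milliseconds to the SRT form hours:minutes:seconds,milliseconds."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def format_entry(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Build one entry: index line, timing line, then the subtitle text."""
    timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
    return f"{index}\n{timing}\n{text}\n"


def build_srt(cues) -> str:
    """Number the cues from 1 and separate the entries with a blank line."""
    entries = (
        format_entry(i, start, end, text)
        for i, (start, end, text) in enumerate(cues, start=1)
    )
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical cues as (start_ms, end_ms, text) tuples.
    print(build_srt([(170, 4234, "Hello"), (4282, 8106, "world")]))
```

Each entry already ends with a newline, so joining the entries with one more newline produces the blank separator line.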
SRT files make it possible to add subtitles to video content after it is produced. For example, they can be uploaded to YouTube videos to add missing subtitles or replace existing ones with higher-quality ones.
Example of an SRT file
This is what the first lines of the SRT file for this YouTube video look like:
1
00:00:00,170 --> 00:00:04,234
AssemblyAI is building AI systems to help you build AI applications

2
00:00:04,282 --> 00:00:08,106
with spoken data. We create superhuman AI models for speech
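To see how the pieces of an entry map to data, here is a rough sketch of a reader for well-formed SRT text like the sample above. The names to_ms and parse_srt are hypothetical helpers, and the sketch assumes that entries are separated by a blank line and that every timing line is valid; it does no error recovery.

```python
import re

# hours:minutes:seconds,milliseconds
TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp to milliseconds (assumes a valid timestamp)."""
    hours, minutes, seconds, millis = map(int, TIMESTAMP.fullmatch(stamp.strip()).groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(content: str) -> list:
    """Split SRT text into entries with index, start, end, and text."""
    entries = []
    # A blank line marks the boundary between two entries.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip incomplete blocks
        start, end = lines[1].split("-->")
        entries.append(
            {
                "index": int(lines[0]),
                "start_ms": to_ms(start),
                "end_ms": to_ms(end),
                "text": "\n".join(lines[2:]),  # subtitle text may span lines
            }
        )
    return entries


SAMPLE = """1
00:00:00,170 --> 00:00:04,234
AssemblyAI is building AI systems to help you build AI applications

2
00:00:04,282 --> 00:00:08,106
with spoken data. We create superhuman AI models for speech
"""

if __name__ == "__main__":
    for entry in parse_srt(SAMPLE):
        print(entry["index"], entry["start_ms"], entry["end_ms"], entry["text"])
```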
Prerequisites to get SRT files in Python
We will use the following dependencies to complete this tutorial:
- The AssemblyAI Python SDK
- A free AssemblyAI API key, which can be copied from your AssemblyAI dashboard
All code in this blog post is also available on GitHub under the subtitle generation guide of the AssemblyAI cookbook repository.
If you want a working code example that transcribes and generates subtitles for YouTube videos, you can check out this Google Colab.
Project Setup for SRT generation
Make sure that you have Python 3.8 or newer already installed on your system and create a new folder for your project. Then, navigate to your project directory in your terminal and create a new virtual environment:
# Mac/Linux:
python3 -m venv venv
. venv/bin/activate
# Windows:
python -m venv venv
.\venv\Scripts\activate.bat
Install the AssemblyAI Python package:
pip install assemblyai
Set your AssemblyAI API key as an environment variable named ASSEMBLYAI_API_KEY. You can get a free API key here.
# Mac/Linux:
export ASSEMBLYAI_API_KEY=<YOUR_KEY>
# Windows:
set ASSEMBLYAI_API_KEY=<YOUR_KEY>
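If you want to confirm from Python that the variable is visible before running any transcription, a small check along these lines may help. The function name get_api_key is a hypothetical helper, not part of the SDK; it only reads the environment.

```python
import os


def get_api_key(env=os.environ) -> str:
    """Return the API key from the environment, or fail with a clear message."""
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ASSEMBLYAI_API_KEY is not set. Export it in the same terminal "
            "session that runs your script."
        )
    return key


if __name__ == "__main__":
    try:
        get_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(err)
```

Environment variables set with export or set only last for the current terminal session, so the check is most useful when run in the same session as your script.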
Create the SRT files for videos in Python
AssemblyAI can produce subtitles as both SRT and VTT files.
First, you'll need the video file for which you want to create the SRT file. You can use either a path to a local file or a URL to a publicly accessible file. The AssemblyAI API supports most common audio and video file formats, so you can submit either audio or video files to generate SRT files. You’ll find all supported file formats in our API documentation.
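Because the input can be either a local path or a URL, it can be handy to validate it before submitting anything. The sketch below is one possible pre-check; describe_source is a hypothetical helper, and it only verifies that a URL looks like http(s) or that a local file exists, not that the media format is supported.

```python
from pathlib import Path
from urllib.parse import urlparse


def describe_source(source: str) -> str:
    """Return "url" for an http(s) URL or "file" for an existing local file."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if Path(source).is_file():
        return "file"
    raise FileNotFoundError(f"Not a URL and not an existing file: {source}")


if __name__ == "__main__":
    print(describe_source("https://storage.googleapis.com/aai-web-samples/aai-overview.mp4"))
```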
Create a new file named main.py and insert the following code:
import assemblyai as aai
# If the API key is not set as an environment variable named
# ASSEMBLYAI_API_KEY, you can also set it like this:
# aai.settings.api_key = "YOUR_API_KEY"
transcriber = aai.Transcriber()
transcript = transcriber.transcribe("https://storage.googleapis.com/aai-web-samples/aai-overview.mp4")
srt = transcript.export_subtitles_srt()
# Save it to a file
with open("subtitle_example.srt", "w", encoding="utf-8") as f:
    f.write(srt)
The above code first imports the assemblyai Python package. Next, it instantiates the Transcriber object, which is used to call AssemblyAI's transcription service.
Calling transcriber.transcribe() starts the transcription process on the specified video file. Here, we used a remote URL, but you can replace it with the path to your own file.
When the transcription is finished, it gets saved in the transcript object. Calling transcript.export_subtitles_srt() then generates the subtitles in SRT format.
Lastly, the script dumps the SRT string into a .srt file. You can modify the output filename to your liking.
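The script above assumes that the transcription succeeds. As a hedged sketch, the export and save steps could be wrapped in a helper that stops early on failure. The name export_srt is hypothetical, and the check assumes that a failed transcript exposes a message in an error attribute; verify this against the SDK documentation for your version.

```python
def export_srt(transcript, output_path: str) -> str:
    """Export a finished transcript as SRT and write it to output_path."""
    # Assumption: a failed transcript carries a message in `transcript.error`.
    error = getattr(transcript, "error", None)
    if error:
        raise RuntimeError(f"Transcription failed: {error}")

    srt = transcript.export_subtitles_srt()

    # Write as UTF-8 so non-ASCII subtitle text is saved consistently
    # regardless of the platform's default encoding.
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(srt)
    return srt
```

In main.py, this helper would take the place of the last few lines, for example export_srt(transcript, "subtitle_example.srt").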
Run the subtitle generation code
Ensure that the main.py file is saved, that your virtual environment is still activated, and that your API key is set. Navigate to the project directory in a terminal and run the main.py file with the following command:
python main.py
Once the script has finished executing, you should see a new .srt file with the generated subtitles in your folder.
Specify the number of characters per caption in the SRT file
You can also customize the maximum number of characters per caption by specifying the chars_per_caption parameter. For example:
srt = transcript.export_subtitles_srt(chars_per_caption=32)
The captions are then limited to 32 characters:
1
00:00:00,170 --> 00:00:01,754
AssemblyAI is building AI

2
00:00:01,802 --> 00:00:03,514
systems to help you build AI
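To double-check the limit on your own output, you could measure the longest caption in the generated text. This is a rough sketch: longest_caption is a hypothetical helper, it assumes entries are separated by a blank line, and it counts the characters of each caption's text with line breaks treated as single spaces, which may differ from how the API counts them.

```python
def longest_caption(srt_text: str) -> int:
    """Return the character count of the longest caption text in SRT content."""
    longest = 0
    for block in srt_text.strip().split("\n\n"):
        # Skip the index line and the timing line; the rest is caption text.
        text = " ".join(block.splitlines()[2:])
        longest = max(longest, len(text))
    return longest


SAMPLE_32 = """1
00:00:00,170 --> 00:00:01,754
AssemblyAI is building AI

2
00:00:01,802 --> 00:00:03,514
systems to help you build AI
"""

if __name__ == "__main__":
    print(longest_caption(SAMPLE_32) <= 32)
```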
Wrapping Up
In this tutorial, you’ve learned how to generate SRT files for videos with AssemblyAI using Python.
Here are a few other helpful resources to learn more about what you can do with transcripts and AssemblyAI’s Speech AI models:
- API documentation for subtitle generation
- Automatically determine video sections with AI using Python
- Key phrase detection in audio files using Python
Alternatively, check out other content on our blog or YouTube channel to learn more about AI, or feel free to join us on Twitter or Discord to stay in the loop when we release new content.