Generating subtitles for videos

You can export a completed transcript in SRT or VTT format for use as subtitles or closed captions in videos. Once the transcript status shows as completed, request the export from the endpoint for the format you need.

In this guide, we’ll walk through the process of generating subtitles for videos using the AssemblyAI API.

SRT and VTT subtitle formats

How do SRT files work?

SRT (SubRip Text) files are commonly used to store subtitles for videos. The format is plain text, and it contains the timing information for each subtitle along with the subtitle text itself.

Here’s a breakdown of how the format works:

  • Each subtitle entry consists of an index number, start time, end time, and text.
  • The index number is a sequential number starting from 1.
  • The start and end times are given in the format hours:minutes:seconds,milliseconds and are separated by -->.
  • The text that follows the timing information is the subtitle text itself, and it may span multiple lines.
  • Entries are separated by a blank line.
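
The rules above can be sketched in a few lines of Python. This is a minimal illustration of the format only, not how AssemblyAI generates its output; the helper names srt_timestamp and srt_entry and the caption data are hypothetical.

```python
def srt_timestamp(ms: int) -> str:
    """Format a millisecond offset as hours:minutes:seconds,milliseconds."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def srt_entry(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Build one SRT entry: index line, timing line, then the subtitle text."""
    timing = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
    return f"{index}\n{timing}\n{text}\n"


# Hypothetical caption data: (start in ms, end in ms, text).
captions = [
    (0, 2500, "Hello, and welcome."),
    (2500, 6040, "Today we look at\nsubtitle formats."),
]

# Entries are numbered from 1 and separated by a blank line.
srt_text = "\n".join(
    srt_entry(i, start, end, text)
    for i, (start, end, text) in enumerate(captions, start=1)
)
print(srt_text)
```

The second caption shows that the text of a single entry may span more than one line.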

How do VTT files work?

WebVTT (Web Video Text Tracks) is a standard format for displaying timed text tracks (such as subtitles or captions) within HTML5 video.

The syntax is similar to SRT but has some differences:

  • The file should begin with the header WEBVTT.
  • Timing is done with a period (.) separating seconds and milliseconds instead of a comma (,).
  • Entries are still separated by a blank line, and a blank line must also follow the WEBVTT header.
  • No index numbers are required.
  • This format is supported by many modern browsers and can be used with the HTML5 <track> element to add subtitles to a <video> element.
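
To make the differences concrete, here is a rough sketch of converting simple SRT text to WebVTT. It is an illustration only, since the API can export VTT directly; the function name srt_to_vtt is hypothetical, and the sketch assumes well-formed input with two-digit hours and no styling tags.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})$")


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT text."""
    out = ["WEBVTT", ""]  # the header comes first, followed by a blank line
    lines = srt_text.strip().splitlines()
    for i, line in enumerate(lines):
        next_line = lines[i + 1].strip() if i + 1 < len(lines) else ""
        # Drop a numeric index line that directly precedes a timing line.
        if line.strip().isdigit() and TIMING.match(next_line):
            continue
        match = TIMING.match(line.strip())
        if match:
            # WebVTT uses a period, not a comma, before the milliseconds.
            line = f"{match[1]}.{match[2]} --> {match[3]}.{match[4]}"
        out.append(line)
    return "\n".join(out) + "\n"


sample = "1\n00:00:00,000 --> 00:00:02,500\nHello.\n\n2\n00:00:02,500 --> 00:00:06,040\nBye.\n"
print(srt_to_vtt(sample))
```

The sketch leaves the subtitle text and the spacing between entries unchanged; only the header, the index lines, and the timestamp separator differ.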

If you plan to upload this file to YouTube, you should be able to use it just like an SRT file. YouTube supports various subtitle formats, including WebVTT.

Get started

Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard.

The entire source code of this guide can be viewed here.

Step-by-step instructions

1

Install the SDK.

pip install -U assemblyai
2

Import the assemblyai package and set your API key.

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"
3

Create a Transcriber object.

transcriber = aai.Transcriber()
4

Use the Transcriber object’s transcribe method and pass in the audio file’s path as a parameter. The transcribe method returns the result of the transcription, which the example below stores in the transcript variable.

transcript = transcriber.transcribe("./my-audio.mp3")
5

Alternatively, you can pass in the URL of a publicly accessible audio file on the internet.

transcript = transcriber.transcribe("https://example.org/audio.mp3")
6

Export SRT subtitles with the export_subtitles_srt method.

print(transcript.export_subtitles_srt())
7

Export VTT subtitles with the export_subtitles_vtt method.

print(transcript.export_subtitles_vtt())
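
Printing is handy for a quick check, but a video player or platform usually needs a file. A minimal sketch of saving the exported text follows; the helper save_subtitles and the file names are hypothetical, and the commented lines assume the transcript variable from the earlier steps.

```python
from pathlib import Path


def save_subtitles(subtitle_text: str, path: str) -> Path:
    """Write subtitle text to a UTF-8 file and return the path."""
    target = Path(path)
    target.write_text(subtitle_text, encoding="utf-8")
    return target


# With the transcript from the earlier steps, usage might look like:
# save_subtitles(transcript.export_subtitles_srt(), "my-audio.srt")
# save_subtitles(transcript.export_subtitles_vtt(), "my-audio.vtt")
```

UTF-8 is chosen here because it handles non-English subtitle text; check what encoding your target player expects.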

Advanced usage

You can also customize the maximum number of characters per caption using the chars_per_caption URL parameter in your API requests to either the SRT or VTT endpoints. For example, adding ?chars_per_caption=32 to the SRT endpoint URL ensures that each caption has no more than 32 characters.
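
As a rough sketch, the request URL with this parameter could be built as shown below. The base URL and path layout are assumptions based on the public REST API and should be checked against the API reference; the function name subtitle_url is hypothetical, and sending the request (with your API key in the headers) is left out.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed base URL of the AssemblyAI REST API; verify it in the API reference.
BASE_URL = "https://api.assemblyai.com/v2"


def subtitle_url(transcript_id: str, fmt: str = "srt",
                 chars_per_caption: Optional[int] = None) -> str:
    """Build the export URL for a transcript in SRT or VTT format."""
    if fmt not in ("srt", "vtt"):
        raise ValueError("fmt must be 'srt' or 'vtt'")
    url = f"{BASE_URL}/transcript/{transcript_id}/{fmt}"
    if chars_per_caption is not None:
        # Appends ?chars_per_caption=N to limit the length of each caption.
        url += "?" + urlencode({"chars_per_caption": chars_per_caption})
    return url


print(subtitle_url("<TRANSCRIPT_ID>", "srt", chars_per_caption=32))
```

Shorter captions tend to suit small screens, while longer ones reduce how often the text changes.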

Conclusion

AssemblyAI produces subtitles as both .srt and .vtt files. These are standard subtitle formats that can be used with videos both on and off the web. For example, after generating your subtitle file, you can add it to a Mux video using their platform, or you can use ffmpeg to embed it in a local video file. Subtitle formats contain plain text, so you can
