Get YouTube Video Transcripts with yt-dlp
In this guide, we’ll show you how to transcribe YouTube videos.
We use the yt-dlp library to download a YouTube video and then transcribe it with the AssemblyAI API.
yt-dlp is a youtube-dl fork with additional features and fixes. It is better maintained and preferred over youtube-dl nowadays.
We’ll show two approaches:
- Option 1: Download video via CLI
- Option 2: Download video via code
Let’s get started!
Quickstart
Step-by-step guide
Install Dependencies
Install yt-dlp and the AssemblyAI Python SDK via pip.
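Both packages are published on PyPI under the names yt-dlp and assemblyai, so "pip install -U yt-dlp assemblyai" is one way to install them. As a minimal sketch, the check below reports which of the two are still missing from the current environment. The helper name missing_packages is hypothetical and not part of either library.

```python
import importlib.util

# Maps the pip package name to the module name that gets imported.
REQUIRED = {"yt-dlp": "yt_dlp", "assemblyai": "assemblyai"}


def missing_packages(required=REQUIRED):
    """Return the pip names of required packages that are not importable."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install -U " + " ".join(missing))
    else:
        print("All dependencies are installed.")
```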
Option 1: Download video via CLI
In this approach, we download the YouTube video from the command line and then transcribe it with the AssemblyAI API. The example video has the ID wtolixa9XTg (https://www.youtube.com/watch?v=wtolixa9XTg).
To download it, use the yt-dlp command with the following options:
- -f m4a/bestaudio: Prefer an m4a audio format and fall back to the best available audio otherwise.
- -o "%(id)s.%(ext)s": Name the output file after the video ID followed by the extension. In this example, the audio is saved as “wtolixa9XTg.m4a”.
- wtolixa9XTg: The ID of the video.
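Put together, the options above correspond to the command yt-dlp -f m4a/bestaudio -o "%(id)s.%(ext)s" wtolixa9XTg. The sketch below assembles the same call as an argument list, so it can be printed for the shell or run from Python through subprocess. The helper names build_command and download_audio are hypothetical.

```python
import shlex
import subprocess


def build_command(video_id):
    """Assemble the yt-dlp call described above as an argument list."""
    return [
        "yt-dlp",
        "-f", "m4a/bestaudio",    # prefer m4a, else best available audio
        "-o", "%(id)s.%(ext)s",   # save the file as <id>.<ext>
        video_id,
    ]


def download_audio(video_id):
    """Run yt-dlp; raises CalledProcessError if the download fails."""
    subprocess.run(build_command(video_id), check=True)


if __name__ == "__main__":
    # Print the equivalent shell command instead of running it.
    print(shlex.join(build_command("wtolixa9XTg")))
```

Running the script only prints the command with shell-safe quoting. Calling download_audio("wtolixa9XTg") fetches the file, which assumes the yt-dlp executable is on your PATH.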
Next, set up the AssemblyAI SDK and transcribe the file. Replace YOUR_API_KEY with your own key. If you don’t have one, you can sign up for free.
Make sure that the path you pass to the transcribe() function corresponds to the saved filename.
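A minimal sketch of this step, assuming the audio was saved as wtolixa9XTg.m4a in the working directory. It uses the AssemblyAI Python SDK's Transcriber class. The wrapper function transcribe_file and its early path check are illustrative additions, not part of the SDK.

```python
import os


def transcribe_file(path, api_key):
    """Transcribe a local audio file and return the transcript text."""
    # Fail early if the path does not match the file yt-dlp saved.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Audio file not found: {path}")

    import assemblyai as aai  # imported lazily so the path check runs first

    aai.settings.api_key = api_key
    transcript = aai.Transcriber().transcribe(path)

    # The SDK reports failures through the transcript status.
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript.text


# Example call (needs a real API key and the downloaded file):
# print(transcribe_file("wtolixa9XTg.m4a", "YOUR_API_KEY"))
```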
Option 2: Download video via code
In this approach we download the video with a Python script instead of the command line.
You can download the file with the following code:
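One way to do this is with the YoutubeDL class that yt-dlp exposes for embedding. The sketch below mirrors the CLI flags from option 1 as an options dictionary. The helper names video_url, build_options, and download_audio are hypothetical.

```python
def video_url(video_id):
    """Build the watch URL for a YouTube video ID."""
    return f"https://www.youtube.com/watch?v={video_id}"


def build_options():
    """yt-dlp options mirroring the CLI flags from option 1."""
    return {
        "format": "m4a/bestaudio",     # same as -f m4a/bestaudio
        "outtmpl": "%(id)s.%(ext)s",   # same as -o "%(id)s.%(ext)s"
    }


def download_audio(video_id):
    """Download the audio; returns yt-dlp's error code (0 on success)."""
    import yt_dlp  # imported lazily so the helpers above work without it

    with yt_dlp.YoutubeDL(build_options()) as ydl:
        return ydl.download([video_url(video_id)])


# Example call (needs network access):
# download_audio("wtolixa9XTg")
```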
After downloading, you can use the same code from option 1 to transcribe the file:
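As a sketch of how the two steps could be chained in one script, the function below downloads the audio and passes the saved path straight to the transcriber. Asking yt-dlp for the filename through prepare_filename avoids hard-coding it, assuming no post-processor renames the file afterwards. The function name download_and_transcribe is hypothetical.

```python
def download_and_transcribe(video_id, api_key):
    """Download a video's audio with yt-dlp, then transcribe the saved file."""
    import assemblyai as aai
    import yt_dlp

    url = f"https://www.youtube.com/watch?v={video_id}"
    options = {"format": "m4a/bestaudio", "outtmpl": "%(id)s.%(ext)s"}

    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(url, download=True)
        # Ask yt-dlp for the filename it wrote instead of hard-coding it.
        audio_path = ydl.prepare_filename(info)

    aai.settings.api_key = api_key
    transcript = aai.Transcriber().transcribe(audio_path)
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript.text


# Example call (needs network access and a real API key):
# print(download_and_transcribe("wtolixa9XTg", "YOUR_API_KEY"))
```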