Transcribe Your Zoom Meetings

This guide creates a Node.js service that captures audio from Zoom Real-Time Media Streams (RTMS) and provides both real-time and asynchronous transcription using AssemblyAI.

Zoom RTMS Documentation

For complete Zoom RTMS documentation, visit https://developers.zoom.us/docs/rtms/

Features

  • Real-time Transcription: Live transcription during meetings using AssemblyAI’s streaming API
  • Asynchronous Transcription: Complete post-meeting transcription with advanced features
  • Flexible Audio Modes:
    • Mixed stream (all participants combined)
    • Individual participant streams transcribed
  • Multichannel Audio Support: Separate channels for different participants
  • Configurable Processing: Enable/disable real-time or async transcription independently

Setup

Prerequisites

  • Node.js 16+
  • FFmpeg installed on your system
  • Zoom RTMS Developer Preview access
  • AssemblyAI API key
  • ngrok (for local development and testing)

Installation

  1. Clone the example repository and install dependencies:
$git clone https://github.com/zkleb-aai/assemblyai-zoom-rtms.git
>cd assemblyai-zoom-rtms
>npm install
  1. Configure environment variables:
$cp .env.example .env

Fill in your .env file:

1# Zoom Configuration
2ZM_CLIENT_ID=your_zoom_client_id
3ZM_CLIENT_SECRET=your_zoom_client_secret
4ZOOM_SECRET_TOKEN=your_webhook_secret_token
5
6# AssemblyAI Configuration
7ASSEMBLYAI_API_KEY=your_assemblyai_api_key
8
9# Service Configuration
10PORT=8080
11REALTIME_ENABLED=true
12REALTIME_MODE=mixed
13ASYNC_ENABLED=true
14AUDIO_CHANNELS=mono
15AUDIO_SAMPLE_RATE=16000
16TARGET_CHUNK_DURATION_MS=100

Local development with ngrok

For testing and development, you can use ngrok to expose your local server to the internet:

  1. Install ngrok: Download from ngrok.com or install via package manager:

    $# macOS
    >brew install ngrok
    >
    ># Windows (chocolatey)
    >choco install ngrok
    >
    ># Or download directly from ngrok.com
  2. Start your local server:

    $npm start
  3. In a separate terminal, start ngrok:

    $ngrok http 8080
  4. Copy the ngrok URL: ngrok will display a forwarding URL like:

    Forwarding https://example-abc123.ngrok-free.app -> http://localhost:8080
  5. Use the ngrok URL in your Zoom app webhook configuration:

    https://example-abc123.ngrok-free.app/webhook

Configuration options

Real-time transcription

  • REALTIME_ENABLED: Enable/disable live transcription (default: true)
  • REALTIME_MODE:
    • mixed: Single stream with all participants combined
    • individual: Separate streams per participant

Audio settings

  • AUDIO_CHANNELS: mono or multichannel
  • AUDIO_SAMPLE_RATE: Audio sample rate in Hz (default: 16000)
  • TARGET_CHUNK_DURATION_MS: Audio chunk duration for streaming (default: 100)

Async transcription

  • ASYNC_ENABLED: Enable/disable post-meeting transcription (default: true)

Usage

Start the service

$npm start

The service will start on the configured port (default: 8080) and display:

🎧 Zoom RTMS to AssemblyAI Transcription Service
📋 Configuration:
Real-time: ✅ (mixed)
Audio: mono @ 16000Hz
Async: ✅
🚀 Server running on port 8080
📡 Webhook endpoint: http://localhost:8080/webhook

Configure Zoom webhook

  1. In your Zoom App configuration, set the webhook endpoint to:

    # For production
    https://your-domain.com/webhook
    # For local development with ngrok
    https://example-abc123.ngrok-free.app/webhook
  2. Subscribe to these events:

    • meeting.rtms_started
    • meeting.rtms_stopped

Testing with ngrok

When using ngrok for testing:

  1. Keep ngrok running: The ngrok tunnel must remain active during testing
  2. Update webhook URL: If you restart ngrok, you’ll get a new URL that needs to be updated in your Zoom app configuration
  3. Monitor ngrok logs: ngrok shows incoming webhook requests in its terminal output
  4. Free tier limitations: The free ngrok tier has some limitations; consider upgrading for heavy testing

Real-time output

During meetings, you’ll see live transcription:

🚀 AssemblyAI session started: [abc12345]
🎙️ [abc12345] Hello everyone, welcome to the meeting
📝 [abc12345] FINAL: Hello everyone, welcome to the meeting.

Post-meeting files

After each meeting, the service generates:

  • transcript_[meeting_uuid].json - Full AssemblyAI response with metadata
  • transcript_[meeting_uuid].txt - Plain text transcript

Advanced configuration

AssemblyAI features

Modify the ASYNC_CONFIG object in the code to enable additional features:

1const ASYNC_CONFIG = {
2 speaker_labels: true, // Speaker identification
3 auto_chapters: true, // Automatic chapter detection
4 sentiment_analysis: true, // Sentiment analysis
5 entity_detection: true, // Named entity recognition
6 redact_pii: true, // PII redaction
7 summarization: true, // Auto-summarization
8 auto_highlights: true, // Key highlights
9};

See AssemblyAI’s API documentation for all available options.

Audio processing modes

Mixed mode (default)

  • Single audio stream combining all participants
  • Most efficient for general transcription
  • Best for meetings with clear speakers

Individual mode

  • Separate transcription stream per participant
  • Better speaker attribution
  • Higher resource usage

Multichannel audio

  • Separate audio channels for different participants
  • Enables advanced speaker separation
  • Requires AUDIO_CHANNELS=multichannel

API endpoints

POST /webhook

Handles Zoom RTMS webhook events:

  • URL validation
  • Meeting start/stop events
  • Automatic RTMS connection setup

Error handling

The service includes comprehensive error handling:

  • Automatic reconnection for dropped connections
  • Graceful cleanup on meeting end
  • Audio buffer flushing to prevent data loss
  • Temporary file cleanup

Monitoring

Real-time logs

  • Connection status updates
  • Audio processing statistics
  • Transcription progress
  • Error notifications

Example log output

📡 Connecting to Zoom signaling for meeting abc123
✅ Zoom signaling connected for meeting abc123
🎵 Connecting to Zoom media for meeting abc123
✅ Zoom media connected for meeting abc123
🚀 Started audio streaming for meeting abc123
🎵 [abc12345] 100 chunks, 32768 bytes, 10.2s
📝 [abc12345] FINAL: This is the final transcription.

Development workflow

  1. Start your local server: npm start
  2. Start ngrok in another terminal: ngrok http 8080
  3. Update your Zoom app webhook URL with the ngrok URL
  4. Test with Zoom meetings
  5. Monitor logs in both your app and ngrok terminals