Tracking customer transcription usage | AssemblyAI

This guide explains how to track individual customer usage within your product for billing purposes.

The approach described in this guide is the recommended way to track customer usage. Creating a separate API key for each of your customers is not recommended: it adds unnecessary complexity and makes your account harder to manage.

There are two separate methods depending on which transcription approach you use:

Async transcription: Use webhooks with custom query parameters to associate transcriptions with customers, then retrieve the audio_duration from the transcript response.
Streaming transcription: Manage customer IDs in your application state and capture the session_duration_seconds from the WebSocket Termination event.

This guide covers both methods in detail.

Async transcription usage tracking

By combining webhooks with custom metadata, you can track audio duration per customer and monitor their usage of your transcription service.

Step 1: Set up webhooks with customer metadata

When submitting a transcription request, include your webhook URL with the customer ID as a query parameter. This allows you to associate each transcription with a specific customer.

Python SDK

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# Add customer_id as a query parameter to your webhook URL
webhook_url = "https://your-domain.com/webhook?customer_id=customer_123"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    language_detection=True,
).set_webhook(webhook_url)

# Submit without waiting for completion
aai.Transcriber().submit("https://example.com/audio.mp3", config)

You can add multiple query parameters to track additional information:

https://your-domain.com/webhook?customer_id=123&project_id=456&order_id=789

This allows you to track usage across multiple dimensions (customer, project, order, etc.).
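Concatenating the query string by hand breaks as soon as a value contains a character that needs escaping. The following is a minimal sketch that builds the URL with the standard library instead; the helper name `build_webhook_url` is hypothetical and not part of the AssemblyAI SDK.

```python
from urllib.parse import urlencode


def build_webhook_url(base_url: str, **params: str) -> str:
    """Append tracking parameters to a webhook URL as an escaped query string."""
    return f"{base_url}?{urlencode(params)}"


# Hypothetical IDs, matching the example URL above
webhook_url = build_webhook_url(
    "https://your-domain.com/webhook",
    customer_id="123",
    project_id="456",
    order_id="789",
)
print(webhook_url)
```

The resulting string can be passed to `set_webhook()` exactly like the hand-written URL in Step 1.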

Step 2: Handle the webhook delivery

When the transcription completes, AssemblyAI sends a POST request to your webhook URL with the following payload:

{
  "transcript_id": "5552493-16d8-42d8-8feb-c2a16b56f6e8",
  "status": "completed"
}

Extract both the transcript_id from the payload and the customer_id from your URL query parameters.
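As a rough illustration of that extraction, the sketch below takes the full request URL and the raw request body and returns the fields needed for tracking. It is framework-agnostic: the function name `parse_webhook` is hypothetical, and how you obtain the URL and body depends on your web framework.

```python
import json
from urllib.parse import parse_qs, urlparse


def parse_webhook(request_url: str, body: bytes) -> dict:
    """Return the customer ID, transcript ID, and status from a webhook delivery."""
    query = parse_qs(urlparse(request_url).query)
    payload = json.loads(body)
    return {
        # parse_qs returns a list per key; take the first value, or None if absent
        "customer_id": query.get("customer_id", [None])[0],
        "transcript_id": payload["transcript_id"],
        "status": payload["status"],
    }
```

A missing `customer_id` comes back as `None`, so the caller can decide whether to reject the request or log it for investigation.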

Step 3: Retrieve the transcript with audio duration

Use the transcript ID to fetch the complete transcript details, which includes the audio_duration field (in seconds).

Python SDK

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# Get transcript using the ID from webhook
transcript = aai.Transcript.get_by_id("<TRANSCRIPT_ID>")

if transcript.status == aai.TranscriptStatus.completed:
    audio_duration = transcript.audio_duration  # Duration in seconds
    # Use audio_duration for billing/tracking

Step 4: Track usage per customer

In your webhook handler, combine the customer ID from your webhook URL query parameters with the audio duration from the transcript to record usage:

Extract the customer_id from the webhook URL query parameters
Extract the transcript_id from the webhook payload
If the status is completed, fetch the transcript using the SDK to get the audio_duration
Store the usage record in your database with the customer ID, transcript ID, audio duration, and timestamp

This allows you to aggregate usage per customer for billing purposes.
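The steps above can be sketched with SQLite from the standard library. Everything here is an assumption made for illustration: the table layout, the function names, and the choice of SQLite itself. The transcript lookup is passed in as a callable so the storage logic stays independent of the SDK; in production it could wrap the `aai.Transcript.get_by_id` call from Step 3.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) usage table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS async_usage (
               transcript_id TEXT PRIMARY KEY,
               customer_id TEXT NOT NULL,
               audio_duration_seconds REAL NOT NULL,
               recorded_at TEXT NOT NULL
           )"""
    )


def record_async_usage(
    conn: sqlite3.Connection,
    customer_id: str,
    transcript_id: str,
    status: str,
    fetch_audio_duration: Callable[[str], float],
) -> None:
    """Store one usage row for a completed transcript; ignore other statuses."""
    if status != "completed":
        return
    duration = fetch_audio_duration(transcript_id)
    # The primary key plus OR IGNORE keeps a repeated delivery from double-counting
    conn.execute(
        "INSERT OR IGNORE INTO async_usage "
        "(transcript_id, customer_id, audio_duration_seconds, recorded_at) "
        "VALUES (?, ?, ?, ?)",
        (transcript_id, customer_id, duration, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def total_audio_seconds(conn: sqlite3.Connection, customer_id: str) -> float:
    """Sum the recorded audio duration for one customer."""
    row = conn.execute(
        "SELECT COALESCE(SUM(audio_duration_seconds), 0) "
        "FROM async_usage WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]
```

Keying the table on the transcript ID is a design choice rather than a requirement: it makes the handler safe to run more than once for the same transcript.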

Streaming transcription usage tracking

Unlike async transcription, which relies on webhooks, streaming transcription requires you to track usage inside your own application. You manage customer IDs in your application state or session management, capture session_duration_seconds from the Termination event, and associate that duration with the customer ID for billing and tracking. AssemblyAI bills streaming based on session duration, so this is the metric you should track.

Step 1: Set up your WebSocket connection

Connect to AssemblyAI’s streaming service. The customer ID is managed entirely in your application and is never sent to AssemblyAI.

import websocket
import json
from urllib.parse import urlencode
from datetime import datetime

# Configuration
YOUR_API_KEY = "<YOUR_API_KEY>"

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "format_turns": True,
}

API_ENDPOINT = f"wss://streaming.assemblyai.com/v3/ws?{urlencode(CONNECTION_PARAMS)}"

Step 2: Capture session duration from the Termination event

The key to tracking usage is capturing the session_duration_seconds field from the Termination message, which is sent when the streaming session ends. The same message also reports audio_duration_seconds, but streaming is billed on session duration.

def on_message(ws, message):
    """Handle WebSocket messages"""
    try:
        data = json.loads(message)
        msg_type = data.get("type")

        if msg_type == "Begin":
            session_id = data.get("id")
            print(f"Session started: {session_id}")

        elif msg_type == "Turn":
            transcript = data.get("transcript", "")
            if data.get("turn_is_formatted"):
                print(f"Transcript: {transcript}")

        elif msg_type == "Termination":
            # Both durations are reported; session duration is the billed metric
            audio_duration_seconds = data.get("audio_duration_seconds", 0)
            session_duration_seconds = data.get("session_duration_seconds", 0)

            print("\nSession terminated:")
            print(f"  Audio Duration: {audio_duration_seconds} seconds")
            print(f"  Session Duration: {session_duration_seconds} seconds")

            # Associate session_duration_seconds with your customer
            # using whatever session management system you have in place
            customer_id = get_customer_id_from_session()  # Your implementation
            log_customer_usage(customer_id, session_duration_seconds)

    except json.JSONDecodeError as e:
        print(f"Error decoding message: {e}")
    except Exception as e:
        print(f"Error handling message: {e}")

Step 3: Log customer usage

When you receive the Termination event, store the session duration for billing/tracking:

Retrieve the customer ID from your session management system (authentication tokens, session cookies, etc.)
Extract the session_duration_seconds from the Termination event
Store the usage record in your database with the customer ID, session duration, and timestamp

Since AssemblyAI bills streaming based on session_duration_seconds, this is the metric you should track for accurate billing.
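One way to picture these steps is a small helper that turns a raw Termination message into a usage record ready for storage. This is only a sketch: the function name `usage_from_termination` and the record layout are hypothetical, and the customer ID is assumed to come from your own session management.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def usage_from_termination(message: str, customer_id: str) -> Optional[dict]:
    """Build a usage record from a Termination message; return None for other types."""
    data = json.loads(message)
    if data.get("type") != "Termination":
        return None
    return {
        "customer_id": customer_id,
        # Session duration is the billed metric for streaming
        "session_duration_seconds": data.get("session_duration_seconds", 0),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
```

The returned dictionary could be handed to whatever `log_customer_usage` does in your system, for example an insert into a usage table.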

Session duration vs audio duration

From the Termination event, you receive two fields:

Field	Description
`session_duration_seconds`	Total time the session was open
`audio_duration_seconds`	Total seconds of audio actually processed

Streaming transcription is billed based on session_duration_seconds, not audio_duration_seconds. Make sure you track the correct metric for accurate billing.

Session management

You need to implement your own session management to associate WebSocket connections with customer IDs. This could be through user authentication tokens, session cookies, database lookups, or in-memory session stores. Track the customer ID throughout the WebSocket lifecycle so you can associate it with the session duration when the Termination event arrives.
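As one possible shape for this, the sketch below keeps an in-memory mapping from each open connection to its customer ID. The class name `SessionRegistry` and its methods are hypothetical. It assumes a single process; a multi-process deployment would need a shared store instead.

```python
import threading
from typing import Optional


class SessionRegistry:
    """In-memory map from an open connection object to a customer ID."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # callbacks may run on another thread
        self._customers: dict = {}

    def register(self, connection: object, customer_id: str) -> None:
        """Remember which customer owns this connection."""
        with self._lock:
            self._customers[id(connection)] = customer_id

    def customer_for(self, connection: object) -> Optional[str]:
        """Look up the customer for a connection, or None if unknown."""
        with self._lock:
            return self._customers.get(id(connection))

    def release(self, connection: object) -> Optional[str]:
        """Forget the connection once its session has ended."""
        with self._lock:
            return self._customers.pop(id(connection), None)
```

Register the connection right after authenticating the user, look it up when the Termination event arrives, and release it when the socket closes so entries do not outlive their sessions.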

Proper session termination

Always close sessions properly to ensure you receive the Termination event and avoid unexpected costs:

# Send termination message when done
terminate_message = {"type": "Terminate"}
ws.send(json.dumps(terminate_message))
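To make sure the Terminate message is sent even when your streaming code raises an exception, the send can be wrapped in a context manager. This is a sketch under one assumption: `ws` is any object with a `send()` method, such as the connection used in the examples above. The name `streaming_session` is hypothetical.

```python
import json
from contextlib import contextmanager


@contextmanager
def streaming_session(ws):
    """Yield the connection and always send Terminate on exit, even after an error."""
    try:
        yield ws
    finally:
        ws.send(json.dumps({"type": "Terminate"}))
```

Keep the connection open after the `with` block exits until the Termination event has been received, since that event carries the duration you need to record.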

Best practices

When implementing billing tracking, consider the following best practices:

Store the transcript/session ID: Always store the identifier alongside usage records. This allows you to audit and verify billing data.
Handle errors gracefully: If a transcription fails (status: "error"), don’t bill the customer for that request. You may want to log failed transcriptions for debugging.
Secure your webhooks: Use the webhook_auth_header_name and webhook_auth_header_value parameters to verify that webhook requests are from AssemblyAI.
Consider time zones: Store timestamps in UTC to avoid confusion when generating billing reports.
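For the webhook security practice, here is a minimal sketch of the receiving side. It assumes the header name and value you configured through webhook_auth_header_name and webhook_auth_header_value arrive as a request header. The function name `is_authentic_webhook` and the example header name are hypothetical.

```python
import hmac
from typing import Mapping


def is_authentic_webhook(
    headers: Mapping[str, str], header_name: str, expected_value: str
) -> bool:
    """Check the configured auth header using a constant-time comparison."""
    # HTTP header names are case-insensitive, so normalize before the lookup
    normalized = {name.lower(): value for name, value in headers.items()}
    received = normalized.get(header_name.lower(), "")
    return hmac.compare_digest(received.encode(), expected_value.encode())
```

Reject the request (for example with a 401 response) when the check returns False, before doing any transcript lookup or usage recording.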

Next steps

Learn more about webhooks and their configuration options
Explore the Submit Transcript API for async transcription
Explore the Get Transcript API for retrieving transcript details
Review the Streaming API for real-time transcription