Browser integration
Connect browser-based apps to the Voice Agent API using a temporary token.
Connect a browser to the Voice Agent API in two steps:
- Your server calls GET /v1/token with your API key to mint a short-lived temporary token.
- Your browser opens the WebSocket with ?token=<token>, so no API key is exposed.
Your API key never leaves your server. Each token is single-use — it starts exactly one session, and all usage is attributed to the key that generated it.
1. Generate a token on your server
Call GET /v1/token with your API key in the Authorization header. Pick an expires_in_seconds short enough to limit replay risk (60–300s is a good default) and an optional max_session_duration_seconds to cap the session length.
expires_in_seconds must be between 1 and 600. max_session_duration_seconds must be between 60 and 10800 (defaults to 10800, the 3-hour maximum session duration).
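A minimal server-side sketch of this step, assuming a runtime with a global fetch (for example Node 18+). The host https://api.example.com, the Bearer scheme in the Authorization header, passing the parameters as query-string values, and a JSON response with a token field are all assumptions made for illustration; check the API reference for the actual request and response shape.

```typescript
// Hypothetical host: replace with the real Voice Agent API base URL.
const API_BASE = "https://api.example.com";

interface TokenOptions {
  expiresInSeconds: number;           // documented range: 1 to 600
  maxSessionDurationSeconds?: number; // documented range: 60 to 10800
}

// Check the documented ranges locally and build the GET /v1/token URL.
function buildTokenUrl(opts: TokenOptions): string {
  const { expiresInSeconds, maxSessionDurationSeconds } = opts;
  if (!(expiresInSeconds >= 1 && expiresInSeconds <= 600)) {
    throw new RangeError("expires_in_seconds must be between 1 and 600");
  }
  const url = new URL("/v1/token", API_BASE);
  // Assumption: parameters are sent as query-string values.
  url.searchParams.set("expires_in_seconds", String(expiresInSeconds));
  if (maxSessionDurationSeconds !== undefined) {
    if (!(maxSessionDurationSeconds >= 60 && maxSessionDurationSeconds <= 10800)) {
      throw new RangeError("max_session_duration_seconds must be between 60 and 10800");
    }
    url.searchParams.set(
      "max_session_duration_seconds",
      String(maxSessionDurationSeconds),
    );
  }
  return url.toString();
}

// Mint a temporary token. The API key stays on the server.
async function mintToken(apiKey: string, opts: TokenOptions): Promise<string> {
  const res = await fetch(buildTokenUrl(opts), {
    // Assumption: the key is sent with the Bearer scheme.
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) {
    throw new Error(`Token request failed with status ${res.status}`);
  }
  // Assumption: the response body is JSON with a "token" field.
  const body = (await res.json()) as { token: string };
  return body.token;
}
```

Your server would typically expose mintToken behind its own authenticated route, so that only signed-in users of your app can obtain a token.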
2. Connect from the browser with the token
Fetch the token from your server, then open the WebSocket with ?token=<token>. No Authorization header is needed.
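A browser-side sketch of this step. The WebSocket URL wss://api.example.com/voice-agent and the /api/voice-token route on your own server are hypothetical placeholders, and the JSON response with a token field is an assumption; only the ?token=<token> query parameter comes from the steps above.

```typescript
// Hypothetical values: replace with the real WebSocket endpoint and
// with the route on your own server that returns a temporary token.
const AGENT_WS_URL = "wss://api.example.com/voice-agent";
const TOKEN_ROUTE = "/api/voice-token";

// Append the temporary token as the ?token= query parameter.
// URLSearchParams takes care of URL-encoding the value.
function buildSocketUrl(baseUrl: string, token: string): string {
  const url = new URL(baseUrl);
  url.searchParams.set("token", token);
  return url.toString();
}

// Fetch a token from your server, then open the WebSocket with it.
// No Authorization header is set on the socket.
async function connectToAgent(): Promise<WebSocket> {
  const res = await fetch(TOKEN_ROUTE);
  if (!res.ok) {
    throw new Error(`Could not fetch token: ${res.status}`);
  }
  // Assumption: your server responds with JSON like { "token": "..." }.
  const { token } = (await res.json()) as { token: string };
  return new WebSocket(buildSocketUrl(AGENT_WS_URL, token));
}
```

Building the URL with the URL class rather than string concatenation keeps the token correctly encoded if it contains characters such as + or /.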
See the Overview quickstart for the full event loop, and Audio format for capturing mic input and playing the agent’s response in a browser.
Fetch a fresh token for every new WebSocket connection. Tokens are single-use — a dropped connection needs a new token to reconnect (including when using session.resume).
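One way to follow this rule is to put the token fetch inside the retry loop, so that no attempt can reuse a spent token. The sketch below is an illustration, not part of the API: connectWithFreshToken, its parameters, and the backoff values are hypothetical, and getToken and open stand in for your own token fetch and WebSocket setup.

```typescript
// Retry a connection, requesting a new single-use token for every attempt.
// getToken: fetches a temporary token from your server.
// open: opens the session with that token and resolves once connected.
async function connectWithFreshToken<S>(
  getToken: () => Promise<string>,
  open: (token: string) => Promise<S>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<S> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      // Never reuse a token from an earlier attempt: it is already spent.
      const token = await getToken();
      return await open(token);
    } catch (err) {
      lastError = err;
      // Exponential backoff between attempts, skipped after the final failure.
      if (attempt < maxAttempts - 1) {
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

The same helper applies when the open callback resumes an earlier session with session.resume, since that reconnect also needs a new token.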