Do you offer voice-to-voice or text-to-speech (TTS)?
Yes! The Voice Agent API provides a complete voice-to-voice pipeline over a single WebSocket connection. It combines AssemblyAI’s speech-to-text, LLM reasoning, and text-to-speech into one integrated service: you stream audio in and receive spoken audio back in real time.
The Voice Agent API is billed at a single all-in rate of $4.50/hr covering STT, LLM reasoning, and TTS. See the Voice Agent API documentation to get started.
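As a rough illustration of what the $4.50/hr rate means for a given session length, here is a minimal Python sketch. It assumes usage is prorated linearly by session duration, which is an assumption and not something stated here; the function name `estimate_cost` is hypothetical and not part of any AssemblyAI SDK. Check the pricing page for the actual billing granularity.

```python
RATE_USD_PER_HOUR = 4.50  # all-in Voice Agent API rate quoted above (STT + LLM + TTS)


def estimate_cost(session_seconds: float,
                  rate_per_hour: float = RATE_USD_PER_HOUR) -> float:
    """Estimate the cost in USD of a session of the given length.

    Assumes linear proration by duration (an assumption, not confirmed
    billing behavior).
    """
    if session_seconds < 0:
        raise ValueError("session_seconds must be non-negative")
    return round(session_seconds / 3600 * rate_per_hour, 4)


if __name__ == "__main__":
    # A 10-minute conversation under the linear-proration assumption.
    print(f"${estimate_cost(10 * 60):.2f}")  # prints "$0.75"
```

Under that assumption, a 10-minute conversation comes to about $0.75 and a full hour to $4.50.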
AssemblyAI does not offer text-to-speech as a standalone service. TTS is available only as part of the Voice Agent API pipeline.