Insights & Use Cases
May 19, 2026

Build a voice agent for telehealth triage

Telehealth triage voice agent tutorial: build real-time AI calls with OPQRST symptom capture, severity scoring, care-level routing, and BAA-backed PHI controls

Kelsey Foster
Growth

A telehealth triage voice agent answers a patient's call, captures symptoms in their own words, scores severity against a defined protocol, and routes the patient to the right care level — emergency, urgent care, virtual visit, or self-care guidance. It doesn't diagnose, doesn't prescribe, and doesn't decide; it triages, in the same way an experienced nurse on a phone line would, then hands off with structured notes attached.

This tutorial walks through building one on the AssemblyAI Voice Agent API with a clinical-specialty prompt and the architectural controls a HIPAA-aligned deployment calls for: encrypted audio, a BAA-backed deployment, PII redaction, and audit logging. We'll cover the triage protocol, symptom capture, severity scoring with tool calls, and the handoff that gets the patient to the right next step. The companion repository is linked at the end.

This is a triage agent, not a clinical decision-maker. Everything in this guide assumes a human clinician makes the final call — the voice agent's job is to capture the data, run the protocol, and route the patient.

What telehealth triage looks like as a voice agent

A triage call follows a predictable structure. The agent:

  1. Greets the patient and confirms identity (name, date of birth)
  2. Asks for the chief complaint in the patient's own words
  3. Walks through a symptom protocol (when did it start, severity, associated symptoms)
  4. Captures red-flag symptoms that escalate severity
  5. Calls a score_severity tool that runs the captured symptoms through a triage algorithm
  6. Routes the patient — ER (911), urgent care, scheduled visit, or self-care
  7. Logs structured notes to the EHR for the receiving clinician

This pattern works for telehealth voice agents because it has a defined protocol, concrete success criteria (was the patient routed correctly?), and a clear failure mode (escalate to a human nurse if anything is unclear). It's not asking the voice agent to diagnose.

Why use the Voice Agent API for telehealth triage

Three properties matter specifically for healthcare:

  • Speech accuracy on medical terminology. Patients say "metoprolol" and "lisinopril" and "I have a history of A-fib." A model that mishears any of these creates a downstream safety issue. Universal-3 Pro Streaming, the STT layer under the Voice Agent API, performs strongly on medical conversations; for post-call note generation and billing-grade documentation, AssemblyAI's Medical Mode async API is purpose-built for clinical terminology.
  • BAA-backed deployment for processing PHI. AssemblyAI enables covered entities and their business associates subject to HIPAA to use AssemblyAI services to process protected health information (PHI), and offers a Business Associate Addendum (BAA) required under HIPAA. Without a BAA you legally cannot route PHI through the service, regardless of how good the model is. Contact our sales team to execute a BAA.
  • Tool calling for protocolized triage. The triage protocol lives in tool calls — score_severity, route_to_care_level, schedule_callback, escalate_to_nurse. The agent calls tools rather than generating free-form clinical guidance, which is what keeps the system inside the bounds of triage and out of the bounds of diagnosis.

Architecture

  Patient call (PSTN via Twilio, or telehealth app)
        ↓
  Voice Agent API (one WebSocket)
   ┌────────────────────────────────────┐
   │  Universal-3 Pro Streaming (STT)    │
   │     ↓                               │
   │  LLM with triage protocol           │
   │     ↓                               │
   │  TTS                                │
   └────────────────────────────────────┘
        │  tool calls
   Tool dispatcher
    - capture_symptom         (structured)
    - score_severity          (runs triage algorithm)
    - route_to_care_level     (ER / urgent / scheduled / self-care)
    - escalate_to_nurse       (live RN handoff)
    - log_to_ehr              (encrypted PHI write)

  (post-call)
   Async Medical Mode API
   - billing-grade SOAP note
   - ICD-10 candidate codes
   - quality review

The Voice Agent API runs the patient-facing conversation. The protocol logic lives in your tools. Post-call documentation goes through the async Medical Mode API for clinical-quality notes.
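One way to picture the tool dispatcher in the diagram is a plain lookup table from tool name to handler. The sketch below is illustrative only: the handler bodies are stubs, and `dispatch`, `HANDLERS`, and the shape of an incoming call (a name plus a dict of arguments) are assumptions for this example rather than the Voice Agent API's actual event schema. Unknown tools and malformed arguments fall through to a nurse escalation, so a dispatcher bug fails safe.

```python
from typing import Any, Callable, Dict

# Hypothetical stub handlers; real ones would call your scorer, telephony, and EHR.
def capture_symptom(category: str, detail: str) -> Dict[str, Any]:
    return {"ok": True, "category": category, "detail": detail}

def escalate_to_nurse(reason: str) -> Dict[str, Any]:
    return {"ok": True, "action": "transfer_to_rn", "reason": reason}

# Tool name -> handler. Add score_severity, route_to_care_level, log_to_ehr here.
HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "capture_symptom": capture_symptom,
    "escalate_to_nurse": escalate_to_nurse,
}

def dispatch(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Run the handler for a tool call; escalate on anything unexpected."""
    handler = HANDLERS.get(name)
    if handler is None:
        return escalate_to_nurse(reason=f"unknown tool requested: {name}")
    try:
        return handler(**arguments)
    except TypeError as exc:
        # Raised for missing or unexpected keyword arguments (and for
        # TypeErrors inside a handler); either way, hand off to a human.
        return escalate_to_nurse(reason=f"bad arguments for {name}: {exc}")
```

Keeping the table explicit also gives clinical reviewers a single place to see every action the agent is able to take.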

Before you start

You need:

  • An AssemblyAI account — for healthcare deployments, contact our sales team to execute a BAA before processing any PHI
  • A defined triage protocol from your clinical team. This guide uses a simplified version for illustration; your real protocol should come from licensed clinicians and be reviewed against ESI (Emergency Severity Index) or your organization's equivalent
  • An EHR integration target (Epic, Cerner, athena, custom)
  • A licensed RN available for live escalations

Important: Don't deploy a telehealth triage agent into production without (1) a BAA executed with AssemblyAI, (2) clinical review of every prompt and tool, (3) an always-available escalation path to a human nurse, and (4) IRB or compliance review per your organization's policies. The agent in this tutorial is a working starter — not a production-ready clinical system.

Step 1: Define the triage protocol in the system prompt

The system prompt is where the protocol lives. The CRITICAL RULES block near the end is what makes the difference between a triage agent and a chatbot:

SYSTEM_PROMPT = """You are an AI telehealth triage assistant for ACME Health.

You are NOT a doctor. You do NOT diagnose. You do NOT prescribe. Your job is
to capture symptoms, run a triage protocol, and route the patient to the
right care level. A licensed clinician makes the final decision.

CALL FLOW:
1. Greet the patient. Confirm name and date of birth.
2. Ask the chief complaint in their own words. Capture it verbatim using
   capture_symptom(category='chief_complaint', detail=...).
3. Walk through the OPQRST protocol:
   - Onset (when did it start?)
   - Provocation/Palliation (what makes it worse or better?)
   - Quality (sharp, dull, throbbing?)
   - Region/Radiation (where, does it spread?)
   - Severity (1–10)
   - Timing (constant, intermittent?)
   Call capture_symptom for each.
4. Screen for red flags relevant to the complaint:
   - Chest pain / shortness of breath / arm pain → cardiac red flags
   - Severe headache / vision changes / weakness → stroke red flags
   - High fever / stiff neck → meningitis red flags
   - Severe abdominal pain / blood → surgical red flags
   - Suicidal ideation → mental health red flags
   If ANY red flag is present, call escalate_to_nurse IMMEDIATELY and
   say: "These symptoms need immediate attention. I'm connecting you to
   our on-call nurse right now."
5. Call score_severity with all captured symptoms.
6. Based on the result, call route_to_care_level with the recommendation.

CRITICAL RULES:
- Never tell the patient what they have. Use "your symptoms suggest..." not
  "you have...".
- Never recommend medication or dosage changes.
- If the patient asks medical questions outside triage, say:
  "I can't answer that. Let me connect you with our nurse line."
  and call escalate_to_nurse.
- If you're uncertain at any point, escalate.

STYLE:
- Speak calmly. One or two sentences per turn.
- Use plain language, not medical jargon. "Pressure in your chest" not
  "thoracic discomfort".
- Confirm critical details back: "You said the pain started Tuesday — is
  that right?"
"""

The escalate-on-uncertainty rule is the most important. A triage agent that confidently routes a heart attack to "schedule a visit" is dangerous. One that hands off to a human nurse the moment a red flag appears, or whenever it is unsure, fails safe.
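The CRITICAL RULES in the prompt can also be spot-checked after the fact against what the agent actually said. The sketch below is a minimal example, assuming you have the agent's utterances as plain strings; `find_rule_violations` and the two patterns are hypothetical and far from exhaustive, so treat a hit as a flag for clinician review rather than a verdict.

```python
import re
from typing import List

# Illustrative patterns only; a clinical team would own the real list.
FORBIDDEN_PATTERNS = {
    # "you have ..." as a statement, but not the question "do you have ...?"
    "diagnosis": re.compile(
        r"(?<!do )(?<!did )\byou (?:have|are having)\b", re.IGNORECASE
    ),
    # A verb about taking or changing medication followed by a dosing word.
    "medication_advice": re.compile(
        r"\b(?:take|stop taking|increase|decrease|double)\b.*"
        r"\b(?:mg|dose|dosage|pills?|tablets?)\b",
        re.IGNORECASE,
    ),
}

def find_rule_violations(agent_utterance: str) -> List[str]:
    """Return the names of the rules this agent utterance appears to break."""
    return [
        name
        for name, pattern in FORBIDDEN_PATTERNS.items()
        if pattern.search(agent_utterance)
    ]
```

Running this over the agent side of every test call gives a cheap regression signal when the prompt changes.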

Step 2: Define the tools

Each tool needs "type": "function" at the top level — the Voice Agent API validates this on session.update.

TOOLS = [
    {
        "type": "function",
        "name": "capture_symptom",
        "description": "Record a symptom or piece of OPQRST data.",
        "parameters": {
            "type": "object",
            "properties": {
                "category": {
                    "type": "string",
                    "enum": ["chief_complaint", "onset", "provocation",
                             "quality", "region", "severity",
                             "timing", "red_flag"],
                },
                "detail": {"type": "string"},
            },
            "required": ["category", "detail"],
        },
    },
    {
        "type": "function",
        "name": "score_severity",
        "description": (
            "Score the patient's severity based on captured symptoms. "
            "Returns an ESI-style level (1=critical, 5=non-urgent)."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "symptoms": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["symptoms"],
        },
    },
    {
        "type": "function",
        "name": "route_to_care_level",
        "description": "Route the patient to the appropriate care level.",
        "parameters": {
            "type": "object",
            "properties": {
                "level": {
                    "type": "string",
                    "enum": ["emergency", "urgent_care", "scheduled_visit",
                             "self_care"],
                },
                "reason": {"type": "string"},
            },
            "required": ["level", "reason"],
        },
    },
    {
        "type": "function",
        "name": "escalate_to_nurse",
        "description": (
            "Connect the patient to a live registered nurse immediately. "
            "Call this for any red-flag symptom or any time the protocol "
            "is unclear."
        ),
        "parameters": {
            "type": "object",
            "properties": {"reason": {"type": "string"}},
            "required": ["reason"],
        },
    },
    {
        "type": "function",
        "name": "log_to_ehr",
        "description": "Write structured triage notes to the EHR.",
        "parameters": {
            "type": "object",
            "properties": {
                "patient_id": {"type": "string"},
                "symptoms": {"type": "object"},
                "severity": {"type": "integer"},
                "disposition": {"type": "string"},
            },
            "required": ["patient_id", "symptoms", "severity", "disposition"],
        },
    },
]

The score_severity tool is where your clinical algorithm lives. In the repo, it's a simple rule-based scorer for demonstration; in production, this is the function your clinical team reviews and signs off on.
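Because the model fills in tool arguments, it can help to check them against the declared schema before a handler runs. The following is a minimal sketch, assuming arguments arrive as an already-parsed dict; `validate_arguments` is a hypothetical helper that covers only `required` and `enum`, not full JSON Schema, and `ROUTE_TOOL` is one entry copied from the TOOLS list above so the example runs on its own.

```python
from typing import Any, Dict, List

ROUTE_TOOL = {
    "type": "function",
    "name": "route_to_care_level",
    "parameters": {
        "type": "object",
        "properties": {
            "level": {
                "type": "string",
                "enum": ["emergency", "urgent_care", "scheduled_visit",
                         "self_care"],
            },
            "reason": {"type": "string"},
        },
        "required": ["level", "reason"],
    },
}

def validate_arguments(tool: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the call looks well-formed."""
    schema = tool["parameters"]
    problems = [
        f"missing required field: {field}"
        for field in schema.get("required", [])
        if field not in arguments
    ]
    for field, value in arguments.items():
        spec = schema["properties"].get(field)
        if spec is None:
            problems.append(f"unexpected field: {field}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{field}={value!r} is not one of {spec['enum']}")
    return problems
```

A non-empty result is a reasonable trigger for escalate_to_nurse, in keeping with the rule that the agent escalates whenever the protocol is unclear.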

Step 3: Severity scoring logic

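# Illustrative keyword lists only. Matching is by substring, so broad terms
# such as "pressure" or "vision" also fire on phrases like "blood pressure",
# which errs toward escalation.
RED_FLAG_KEYWORDS = {
    "cardiac": ["chest pain", "pressure", "tight", "shortness of breath",
                "arm pain", "jaw pain", "sweating"],
    "stroke": ["face drooping", "weakness", "slurred speech", "vision",
               "confusion"],
    "surgical": ["severe abdominal", "blood in stool", "vomiting blood",
                 "rigid abdomen"],
    "sepsis": ["high fever", "stiff neck", "altered mental"],
    "mental": ["suicidal", "self-harm", "kill myself"],
}

def score_severity(symptoms):
    """Return an ESI-style level (1 = critical, 5 = non-urgent) and a route."""
    # Flatten the captured symptom strings into one lowercase blob to search.
    text = " ".join(s.lower() for s in symptoms)
    # Any red-flag keyword short-circuits to level 1.
    for category, keywords in RED_FLAG_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return {"level": 1, "category": category, "route": "emergency"}
    # Otherwise fall through pain and fever tiers, most severe first. These
    # only match if the symptom was captured in exactly this wording.
    if any(kw in text for kw in ["severe pain", "9/10", "10/10", "can't breathe"]):
        return {"level": 2, "route": "emergency"}
    if any(kw in text for kw in ["moderate pain", "7/10", "8/10", "fever 101", "fever 102"]):
        return {"level": 3, "route": "urgent_care"}
    if any(kw in text for kw in ["mild pain", "5/10", "6/10"]):
        return {"level": 4, "route": "scheduled_visit"}
    # Nothing matched: lowest acuity.
    return {"level": 5, "route": "self_care"}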

This is illustrative only. Real telehealth triage uses validated scoring (ESI, AMTS, organization-specific protocols) developed and reviewed by clinical staff. Don't ship anything to production without that review.
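One limitation of literal matching on strings like "7/10" is that patients rarely phrase severity that way; "5 out of 10" or "a seven out of ten" would slip past it. The sketch below shows one way to normalize a spoken pain rating before scoring. `parse_pain_score` and `level_from_pain` are hypothetical helpers, and the thresholds simply mirror the illustrative tiers above; they are not a validated scale.

```python
import re
from typing import Optional

_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
          "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

# Matches "7/10", "7 / 10", "7 out of 10", "seven out of ten", and so on.
_SCORE = re.compile(
    r"\b(10|[0-9]|one|two|three|four|five|six|seven|eight|nine|ten)"
    r"\s*(?:/|out of)\s*(?:10|ten)\b",
    re.IGNORECASE,
)

def parse_pain_score(text: str) -> Optional[int]:
    """Extract a 0-10 pain rating from free text, or None if none is stated."""
    match = _SCORE.search(text)
    if match is None:
        return None
    token = match.group(1).lower()
    return int(token) if token.isdigit() else _WORDS[token]

def level_from_pain(score: Optional[int]) -> Optional[int]:
    """Map a pain rating to the illustrative levels used in this guide."""
    if score is None:
        return None
    if score >= 9:
        return 2
    if score >= 7:
        return 3
    if score >= 5:
        return 4
    return 5
```

Red-flag screening should still run first; a pain rating only refines the level when no red flag is present.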

Step 4: Audit logging and PHI controls

Every transcript event from the Voice Agent API is PHI. Treat it as such:

  • Encrypt at rest. Use envelope encryption (KMS) for any persisted audio or transcripts.
  • Encrypt in transit. The Voice Agent API WebSocket runs over TLS (wss://), so there is no additional work there.
  • Audit log every access. Who read which call, when, from where.
  • Apply PII redaction to anything that leaves your VPC. Phone numbers, addresses, SSNs, and names should be redacted before transcripts hit analytics warehouses or training pipelines.
  • Set retention policies. Retention requirements vary by state and record type; many healthcare orgs keep triage call transcripts for around 7 years. Confirm the period with your compliance team and configure your storage accordingly.
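As a rough illustration of the redaction bullet, the sketch below masks US-format phone numbers and SSNs in a transcript string before it is exported. It is an application-side example with hypothetical names (`redact`, `_PATTERNS`), not the API's built-in redaction, and regexes cannot reliably catch names or street addresses, which need a dedicated redaction step.

```python
import re

# Illustrative US-format patterns. Order matters: SSNs are masked first.
_PATTERNS = [
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(
        r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"
    )),
]

def redact(text: str) -> str:
    """Replace phone numbers and SSNs with placeholders before export."""
    for placeholder, pattern in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

For example, `redact("Call me at (415) 555-0132")` returns `"Call me at [PHONE]"`, while a sentence such as "pain is 7 out of 10" passes through unchanged.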

The Voice Agent API's events (transcript.user, transcript.agent, tool.call, tool.result) are exactly what you'd write to the EHR. Build the log_to_ehr tool to flush a structured record at the end of every call.
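A sketch of how that end-of-call record might be assembled is shown below. It assumes each event is a dict with a `type` key plus `name`, `arguments`, and `result` fields; those field names are guesses for illustration, so check them against the actual event payloads. The output keys match the log_to_ehr parameters defined in Step 2, and `build_triage_record` is a hypothetical helper.

```python
from typing import Any, Dict, List

def build_triage_record(patient_id: str, events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold a call's tool events into the fields log_to_ehr expects."""
    symptoms: Dict[str, List[str]] = {}
    severity = None
    disposition = None
    for event in events:
        kind = event.get("type")
        name = event.get("name")
        if kind == "tool.call":
            args = event.get("arguments", {})
            if name == "capture_symptom":
                # Keep every detail per category (several red flags may occur).
                symptoms.setdefault(args["category"], []).append(args["detail"])
            elif name == "escalate_to_nurse":
                disposition = "escalated_to_nurse"
            elif name == "route_to_care_level" and disposition != "escalated_to_nurse":
                # An escalation, once recorded, is never overwritten by routing.
                disposition = args["level"]
        elif kind == "tool.result" and name == "score_severity":
            severity = event.get("result", {}).get("level")
    return {
        "patient_id": patient_id,
        "symptoms": symptoms,
        "severity": severity,
        "disposition": disposition,
    }
```

Building the record from tool events rather than from free-form transcript text keeps the EHR entry tied to the structured data the protocol actually captured.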

Step 5: Test against representative cases

Before any patient calls the agent, run it against a clinical test suite:

Case                                                  Expected route
----------------------------------------------------  ------------------------------------------
"I have crushing chest pain and my left arm is numb"  emergency (cardiac red flag)
"I have a fever of 102 and a stiff neck"              emergency (sepsis red flag)
"I sprained my ankle yesterday, pain is 5 out of 10"  urgent_care or scheduled_visit
"I have a runny nose and slight cough for two days"   self_care
"I'm having thoughts of hurting myself"               escalate_to_nurse (mental health red flag)

Run at least 200 cases through the agent with clinician review of every disposition. The cost of a missed escalation is a clinical safety event; the cost of an over-escalation is overuse of the nurse line. Tune until both are within your organization's tolerance.
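The two error types in that trade-off can be tracked with a small harness. The sketch below is a minimal example under stated assumptions: `route_fn` stands in for whatever runs a case through your agent and returns a disposition string, `ACUITY_RANK` is a hypothetical ordering that treats escalate_to_nurse as equally acute to emergency, and each case has a single clinician-approved expected route.

```python
from typing import Callable, Dict, List, Tuple

# Lower rank means more acute.
ACUITY_RANK = {
    "escalate_to_nurse": 0,
    "emergency": 0,
    "urgent_care": 1,
    "scheduled_visit": 2,
    "self_care": 3,
}

def evaluate(cases: List[Tuple[str, str]],
             route_fn: Callable[[str], str]) -> Dict[str, float]:
    """Return under-triage and over-triage rates for (utterance, expected) cases."""
    if not cases:
        raise ValueError("need at least one test case")
    under = over = 0
    for utterance, expected in cases:
        got = route_fn(utterance)
        if ACUITY_RANK[got] > ACUITY_RANK[expected]:
            under += 1  # routed to a less acute level than expected: safety risk
        elif ACUITY_RANK[got] < ACUITY_RANK[expected]:
            over += 1   # routed to a more acute level than expected: cost risk
    total = len(cases)
    return {
        "cases": total,
        "under_triage_rate": under / total,
        "over_triage_rate": over / total,
    }
```

Tracking the two rates separately matters because they have different tolerances: most organizations will accept some over-triage but very little under-triage.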

Step 6: Post-call documentation with Medical Mode

After the call, run the captured audio through AssemblyAI's Medical Mode async API for billing-grade clinical documentation. Enable it with the domain="medical-v1" parameter on a standard pre-recorded transcript request:

import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"

config = aai.TranscriptionConfig(
    speech_models=["universal-3-pro", "universal-2"],
    domain="medical-v1",       # enables Medical Mode
    speaker_labels=True,        # provider/patient separation
    keyterms_prompt=["Lispro", "Humalog", "metoprolol"],
)
call_audio_url = "https://example.com/triage-call.wav"  # placeholder: URL of the recorded call
transcript = aai.Transcriber().transcribe(call_audio_url, config)
# Then send transcript.text through the LLM Gateway for SOAP generation.

Medical Mode is purpose-built for medication names, procedures, conditions, and dosages — it's billed as a separate add-on (see pricing). Combine it with LLM Gateway SOAP generation to produce structured chart entries from the transcript.
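The SOAP step starts with turning the transcript into a prompt. The sketch below shows one possible shape for that, taking speaker-labelled turns as (speaker, text) pairs. `build_soap_prompt` and the instruction wording are hypothetical, and the LLM Gateway request itself is left out because its exact interface is not covered in this guide.

```python
from typing import Iterable, Tuple

# Illustrative instructions; your clinical team should own the real wording.
SOAP_INSTRUCTIONS = (
    "Draft a SOAP note (Subjective, Objective, Assessment, Plan) from the "
    "triage call below. Use only what was said; mark anything not stated as "
    "'not documented'. Do not add a diagnosis; a clinician will review this draft."
)

def build_soap_prompt(turns: Iterable[Tuple[str, str]]) -> str:
    """Join speaker-labelled turns into a single prompt string."""
    dialogue = "\n".join(
        f"Speaker {speaker}: {text.strip()}" for speaker, text in turns
    )
    return f"{SOAP_INSTRUCTIONS}\n\nTRANSCRIPT:\n{dialogue}"
```

With speaker_labels=True, the turns can be taken from the transcript's utterances, for example `[(u.speaker, u.text) for u in transcript.utterances]`.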

The complete repository

Fork the runnable repo at github.com/kelsey-aai/telehealth-triage-voice-agent. It includes the triage agent loop, the OPQRST protocol prompt, the red-flag scorer, the routing logic, and a sample EHR adapter stub. Around 350 lines of Python.

Frequently asked questions

How do I build a voice agent for telehealth triage?

To build a voice agent for telehealth triage, open an AssemblyAI Voice Agent API session with a clinical-specialty system prompt that walks the patient through an OPQRST symptom protocol, screens for red flags, and routes via tool calls. The agent should never diagnose or prescribe — it captures symptoms with capture_symptom, scores severity with score_severity (your clinical algorithm), routes via route_to_care_level, and escalates to a live RN through escalate_to_nurse whenever red flags appear or the protocol is unclear. All of this runs inside one WebSocket at wss://agents.assemblyai.com/v1/ws, with audit logging, encrypted transcripts, and a BAA executed with AssemblyAI before any PHI is processed.

Can I use the Voice Agent API for healthcare workflows subject to HIPAA?

AssemblyAI is considered a business associate under HIPAA and offers a standard Business Associate Addendum (BAA) for customers processing PHI. Before processing any PHI you need to execute the BAA with AssemblyAI — contact our sales team. The Voice Agent API uses TLS for transit, supports PII redaction, and provides per-session audit logs. Your application also needs its own architecture aligned to HIPAA — encryption at rest, role-based access controls, audit logging, retention policies — to meet your obligations end-to-end.

Can a telehealth voice agent diagnose patients?

No. A telehealth triage voice agent should never diagnose, prescribe, or provide clinical decisions. Its role is to capture symptoms, run a defined triage protocol developed by licensed clinicians, score severity, and route the patient to the appropriate care level — emergency, urgent care, scheduled visit, or self-care. A human clinician (nurse, physician, NP) makes the final clinical decision. The system prompt should explicitly forbid diagnostic statements ("you have..." — never; "your symptoms suggest..." — only when leading into a routing decision).

How does the Voice Agent API handle medical terminology?

The STT layer under the Voice Agent API is Universal-3 Pro Streaming, which performs well on conversational medical terminology like medication names and common conditions. For billing-grade clinical documentation — SOAP notes, ICD-10 candidate coding, structured chart entries — AssemblyAI's separate Medical Mode async API is purpose-built for clinical accuracy. Enable it with domain="medical-v1" on a pre-recorded transcript request. The common architecture is: real-time triage on the Voice Agent API, post-call documentation through Medical Mode async, both under the same BAA.

What happens when the agent encounters a red flag?

When the agent detects a red flag — cardiac symptoms (chest pain, arm pain, shortness of breath), stroke symptoms (facial drooping, slurred speech, weakness), surgical symptoms (severe abdominal pain), sepsis indicators (high fever with stiff neck), or mental health emergencies (suicidal ideation) — it should immediately call escalate_to_nurse with the reason, tell the patient "These symptoms need immediate attention. I'm connecting you to our on-call nurse right now," and hand off the call along with the captured symptoms. Red-flag escalation must be automatic, not conditional. Never let the agent continue triaging after a red flag is captured.

What's the difference between this and a healthcare scheduling voice agent?

A healthcare scheduling voice agent books appointments, verifies insurance, and handles prescription refills — administrative tasks where the worst-case error is a rescheduled appointment. A telehealth triage voice agent captures clinical symptoms and routes to care levels — clinical tasks where the worst-case error is a missed cardiac event. The two have different risk profiles, different prompts, different tools, and different review processes. A team building both should keep them as separate agents with separate audit trails. Our healthcare voice agents guide covers the scheduling/administrative side.

AI voice agents
Medical
Healthcare