Medical Mode - AssemblyAI

Medical Mode is an add-on that enhances streaming transcription accuracy for medical terminology — including medication names, procedures, conditions, and dosages. It is optimized for medical entity recognition to correct terms that other models frequently get wrong. Medical Mode can be used with all of our Real-time STT models. Enable Medical Mode by setting the domain connection parameter to "medical-v1". No other changes to your existing pipeline are required.

Quickstart

Set domain to "medical-v1" as a connection parameter when you open the WebSocket.

Python:

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    "domain": "medical-v1",
}

Python SDK:

client.connect(
    StreamingParameters(
        sample_rate=16000,
        speech_model="u3-rt-pro",
        domain="medical-v1",
    )
)

JavaScript:

const CONNECTION_PARAMS = {
  sample_rate: 16000,
  speech_model: "u3-rt-pro",
  domain: "medical-v1",
};

JavaScript SDK:

const transcriber = client.streaming.transcriber({
  sampleRate: 16_000,
  speechModel: "u3-rt-pro",
  domain: "medical-v1",
});
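
When connecting over a raw WebSocket rather than through an SDK, the connection parameters are typically sent as a URL query string. The following is a minimal sketch of that step, not a complete client. The endpoint URL and the `build_streaming_url` helper are assumptions for illustration; confirm the endpoint against the current API reference.

```python
from urllib.parse import urlencode

# Assumption: v3 streaming endpoint; verify against the API reference.
API_ENDPOINT_BASE_URL = "wss://streaming.assemblyai.com/v3/ws"

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    "domain": "medical-v1",  # enables Medical Mode
}


def build_streaming_url(base_url: str, params: dict) -> str:
    """Append the connection parameters to the WebSocket URL as a query string."""
    return f"{base_url}?{urlencode(params)}"


url = build_streaming_url(API_ENDPOINT_BASE_URL, CONNECTION_PARAMS)
print(url)
```

The resulting URL is what a WebSocket client would open; authentication headers and audio streaming are left out of this sketch.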

Example output

Without Medical Mode:

I have here insulin to be used for both prandial mealtime and sliding scale is
insulin lisprohumalog subcutaneously.

With Medical Mode, "lisprohumalog" is corrected to "Lispro (Humalog)", following the standard medical convention of writing the generic name first with the brand name in parentheses.

I have here insulin to be used for both prandial mealtime and sliding scale is
insulin Lispro (Humalog) subcutaneously.
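
To evaluate Medical Mode on your own audio, it can help to list exactly which words differ between two transcripts of the same recording. The sketch below does a word-level diff with Python's standard `difflib`. The `changed_terms` helper is a hypothetical name, and the two strings are the example transcripts above.

```python
from difflib import SequenceMatcher


def changed_terms(before: str, after: str) -> list[tuple[str, str]]:
    """Return (before, after) pairs for the word spans that differ."""
    a, b = before.split(), after.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


without_medical = (
    "I have here insulin to be used for both prandial mealtime and "
    "sliding scale is insulin lisprohumalog subcutaneously."
)
with_medical = (
    "I have here insulin to be used for both prandial mealtime and "
    "sliding scale is insulin Lispro (Humalog) subcutaneously."
)

changes = changed_terms(without_medical, with_medical)
for old, new in changes:
    print(f"{old!r} -> {new!r}")
```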

Use cases

Medical Mode is designed for healthcare AI applications where accurate medical terminology is critical:

Ambient clinical documentation — Capture medication names, dosages, and clinical terms correctly during live patient encounters.
Real-time medical scribes — Deliver accurate transcripts to clinicians during or immediately after a consult.
Front-office voice agents — Handle drug names, provider names, and clinic-specific terminology in scheduling calls and insurance verification.
Medical contact centers — Transcribe calls with correct medical vocabulary for downstream processing and quality assurance.

Combine with other features

Medical Mode works alongside other streaming features. You can combine it with:

Streaming Diarization to identify who said what in clinical conversations
Keyterms Prompting to further boost accuracy for specific medical terms unique to your use case

Python:

import json

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    "domain": "medical-v1",
    "speaker_labels": "true",
    # The key terms list is sent as a JSON-encoded string
    "keyterms_prompt": json.dumps(["Lisinopril", "Metformin", "Humalog"]),
}

Python SDK:

client.connect(
    StreamingParameters(
        sample_rate=16000,
        speech_model="u3-rt-pro",
        domain="medical-v1",
        speaker_labels=True,
        keyterms_prompt=["Lisinopril", "Metformin", "Humalog"],
    )
)

JavaScript:

const CONNECTION_PARAMS = {
  sample_rate: 16000,
  speech_model: "u3-rt-pro",
  domain: "medical-v1",
  speaker_labels: true,
  keyterms_prompt: JSON.stringify(["Lisinopril", "Metformin", "Humalog"]),
};

JavaScript SDK:

const transcriber = client.streaming.transcriber({
  sampleRate: 16_000,
  speechModel: "u3-rt-pro",
  domain: "medical-v1",
  speakerLabels: true,
  keytermsPrompt: ["Lisinopril", "Metformin", "Humalog"],
});
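
In the raw (non-SDK) snippets, `speaker_labels` and `keyterms_prompt` are passed as strings, whereas the SDKs accept native booleans and lists. One way to keep native Python values in your own code is to convert them just before building the query string. The sketch below assumes this convention; `encode_connection_params` is a hypothetical helper, and sending `"false"` for a disabled flag is an assumption, since only `"true"` appears in the examples above.

```python
import json
from urllib.parse import parse_qs, urlencode


def encode_connection_params(params: dict) -> str:
    """Encode native Python values into a query string.

    Booleans become "true"/"false" and lists become JSON strings,
    mirroring the raw connection-parameter examples.
    """
    encoded = {}
    for key, value in params.items():
        if isinstance(value, bool):
            encoded[key] = "true" if value else "false"  # "false" is an assumption
        elif isinstance(value, (list, tuple)):
            encoded[key] = json.dumps(list(value))
        else:
            encoded[key] = value
    return urlencode(encoded)


query = encode_connection_params(
    {
        "sample_rate": 16000,
        "speech_model": "u3-rt-pro",
        "domain": "medical-v1",
        "speaker_labels": True,
        "keyterms_prompt": ["Lisinopril", "Metformin", "Humalog"],
    }
)

# Round-trip check: decode the query string to inspect what would be sent.
decoded = parse_qs(query)
print(decoded["speaker_labels"], decoded["keyterms_prompt"])
```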

Configuration for medical audio

Medical conversations — such as clinical dictation, patient encounters, and ambient scribes — have different speech patterns than typical voice agent interactions. Clinicians often pause mid-sentence to think, review a chart, or formulate a diagnosis. The default turn detection settings are optimized for fast-paced voice agent dialogues and can incorrectly fragment these natural pauses into separate turns. To prevent premature turn boundaries in medical audio, increase the silence thresholds:

const streamingConfig = {
  min_turn_silence: 800,
  max_turn_silence: 3600,
};

| Parameter | Default | Recommended for medical | Why |
| --- | --- | --- | --- |
| `min_turn_silence` | `100` ms (U3 Pro) / `400` ms (Universal Streaming) | `800` ms | Gives clinicians time to pause mid-sentence without triggering a speculative end-of-turn check. |
| `max_turn_silence` | `1000` ms (U3 Pro) / `1280` ms (Universal Streaming) | `3600` ms | Allows extended pauses for chart review or thinking without forcing a turn boundary. |

These values match the Conservative quick start configuration on the turn detection page. You can further adjust them based on your specific workflow — for example, a real-time medical scribe may benefit from a lower max_turn_silence (around 2000 ms) than a dictation application.

Python:

CONNECTION_PARAMS = {
    "sample_rate": 16000,
    "speech_model": "u3-rt-pro",
    "domain": "medical-v1",
    "min_turn_silence": 800,
    "max_turn_silence": 3600,
}

Python SDK:

client.connect(
    StreamingParameters(
        sample_rate=16000,
        speech_model="u3-rt-pro",
        domain="medical-v1",
        min_turn_silence=800,
        max_turn_silence=3600,
    )
)

JavaScript:

const CONNECTION_PARAMS = {
  sample_rate: 16000,
  speech_model: "u3-rt-pro",
  domain: "medical-v1",
  min_turn_silence: 800,
  max_turn_silence: 3600,
};

JavaScript SDK:

const transcriber = client.streaming.transcriber({
  sampleRate: 16_000,
  speechModel: "u3-rt-pro",
  domain: "medical-v1",
  minTurnSilence: 800,
  maxTurnSilence: 3600,
});
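
If one application serves several workflows, the silence thresholds can be kept as named presets and merged into the connection parameters per session. This is a sketch only: the preset names and the `medical_connection_params` helper are hypothetical, the dictation values come from the table above, and the scribe preset assumes `min_turn_silence` stays at 800 ms while `max_turn_silence` drops to the suggested 2000 ms.

```python
# Hypothetical presets; tune the values for your own workflow.
MEDICAL_TURN_PRESETS = {
    "dictation": {"min_turn_silence": 800, "max_turn_silence": 3600},
    "scribe": {"min_turn_silence": 800, "max_turn_silence": 2000},
}


def medical_connection_params(workflow: str, base: dict) -> dict:
    """Merge Medical Mode and a turn-silence preset into base parameters."""
    if workflow not in MEDICAL_TURN_PRESETS:
        raise ValueError(f"unknown workflow: {workflow!r}")
    # Later entries win, so the preset overrides any thresholds in `base`.
    return {**base, "domain": "medical-v1", **MEDICAL_TURN_PRESETS[workflow]}


base = {"sample_rate": 16000, "speech_model": "u3-rt-pro"}
dictation = medical_connection_params("dictation", base)
scribe = medical_connection_params("scribe", base)
print(dictation)
print(scribe)
```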

Avoid setting end_of_turn_confidence_threshold to 0

If you are using a Universal Streaming model (not U3 Pro), do not set end_of_turn_confidence_threshold to 0. This completely disables semantic turn detection and forces a turn boundary at every silence, which is especially harmful for medical audio where mid-sentence pauses are common. See Turn detection for details.
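
A simple pre-flight check can catch this misconfiguration before a session opens. The sketch below is an assumption-laden illustration: `check_turn_detection` is a hypothetical helper, it treats any `speech_model` other than `"u3-rt-pro"` as a Universal Streaming model, and the non-U3 model name in the usage example is a placeholder, not a real identifier.

```python
def check_turn_detection(params: dict) -> list[str]:
    """Return warnings for turn-detection settings that hurt medical audio."""
    warnings = []
    # Assumption: anything that is not U3 Pro is a Universal Streaming model.
    is_u3_pro = params.get("speech_model") == "u3-rt-pro"
    threshold = params.get("end_of_turn_confidence_threshold")
    if not is_u3_pro and threshold is not None and threshold == 0:
        warnings.append(
            "end_of_turn_confidence_threshold is 0: semantic turn detection "
            "is disabled and every silence forces a turn boundary."
        )
    return warnings


risky = {
    "speech_model": "universal-streaming-placeholder",  # hypothetical identifier
    "domain": "medical-v1",
    "end_of_turn_confidence_threshold": 0,
}
safe = {"speech_model": "u3-rt-pro", "domain": "medical-v1"}

risky_warnings = check_turn_detection(risky)
safe_warnings = check_turn_detection(safe)
print(risky_warnings)
print(safe_warnings)
```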

HIPAA compliance

AssemblyAI offers a Business Associate Agreement (BAA) for customers who need to process Protected Health Information (PHI). AssemblyAI is SOC 2 Type 2, ISO 27001:2022, and PCI DSS v4.0 certified. Medical Mode does not change existing data handling or retention policies. For BAA setup or enterprise pricing, contact our sales team.
