POST /understanding
curl --request POST \
  --url https://llm-gateway.assemblyai.com/v1/understanding \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "transcript_id": "12345",
  "speech_understanding": {
    "request": {
      "translation": {
        "target_languages": [
          "es",
          "de"
        ],
        "formal": true,
        "match_original_utterance": true
      }
    }
  }
}
'
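
For readers calling the endpoint from Python instead of curl, the request above can be sketched with the standard library alone. The URL, headers, and body are copied from the curl example; the helper names `build_translation_request` and `send` are hypothetical and not part of any AssemblyAI SDK. This is a minimal sketch, with no error handling or retries.

```python
import json
import urllib.request

# URL and header names are copied from the curl example above.
UNDERSTANDING_URL = "https://llm-gateway.assemblyai.com/v1/understanding"


def build_translation_request(api_key: str, transcript_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "transcript_id": transcript_id,
        "speech_understanding": {
            "request": {
                "translation": {
                    "target_languages": ["es", "de"],
                    "formal": True,
                    "match_original_utterance": True,
                }
            }
        },
    }
    return urllib.request.Request(
        UNDERSTANDING_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body (hypothetical helper)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect before any network call, for example `send(build_translation_request("<api-key>", "12345"))`.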
Example response:

{
  "speech_understanding": {
    "request": {
      "translation": {
        "target_languages": [
          "es",
          "de"
        ],
        "formal": true,
        "match_original_utterance": true
      },
      "speaker_identification": {
        "speaker_type": "name",
        "speakers": [
          {
            "name": "Michel Martin",
            "description": "Hosts the program and interviews the guests",
            "company": "NPR",
            "title": "Host Morning Edition"
          },
          {
            "name": "Peter DeCarlo",
            "description": "Answers questions from the interview",
            "company": "Johns Hopkins University",
            "title": "Professor and Vice Chair of Environmental Health and Engineering"
          }
        ]
      },
      "custom_formatting": {
        "date": "mm/dd/yyyy",
        "phone_number": "(xxx)xxx-xxxx",
        "email": "username@domain.com"
      }
    },
    "response": {
      "translation": {
        "status": "completed"
      },
      "speaker_identification": {
        "mapping": {
          "A": "Michel Martin",
          "B": "Peter DeCarlo"
        },
        "status": "completed"
      },
      "custom_formatting": {
        "status": "completed",
        "mapping": {
          "2024-12-25": "12/25/2024",
          "555-1234-5678": "(555)123-45678"
        },
        "formatted_text": "Call me at (555)123-45678 on 12/25/2024",
        "formatted_utterances": [
          {
            "confidence": 0.92,
            "start": 0,
            "end": 2500,
            "text": "Hi, I'm the interviewer. Call me at (555)123-45678 on 12/25/2024",
            "speaker": "interviewer"
          },
          {
            "confidence": 0.95,
            "start": 2500,
            "end": 5000,
            "text": "Thanks! I'll reach out then.",
            "speaker": "candidate"
          }
        ]
      }
    }
  },
  "translated_texts": {
    "es": "Hola, soy el entrevistador. Llámame al cinco cinco cinco uno dos tres cuatro cinco seis siete ocho el veinticinco de diciembre de dos mil veinticuatro. ¡Gracias! Me pondré en contacto entonces.",
    "de": "Hallo, ich bin der Interviewer. Rufen Sie mich an unter fünf fünf fünf eins zwei drei vier fünf sechs sieben acht am fünfundzwanzigsten Dezember zweitausendvierundzwanzig. Danke! Ich werde mich dann melden."
  },
  "utterances": [
    {
      "confidence": 0.92,
      "start": 0,
      "end": 2500,
      "text": "Hi, I'm the interviewer. Call me at five five five one two three four five six seven eight on December twenty fifth twenty twenty four",
      "speaker": "interviewer",
      "translated_texts": {
        "es": "Hola, soy el entrevistador. Llámame al cinco cinco cinco uno dos tres cuatro cinco seis siete ocho el veinticinco de diciembre de dos mil veinticuatro",
        "de": "Hallo, ich bin der Interviewer. Rufen Sie mich an unter fünf fünf fünf eins zwei drei vier fünf sechs sieben acht am fünfundzwanzigsten Dezember zweitausendvierundzwanzig"
      }
    },
    {
      "confidence": 0.95,
      "start": 2500,
      "end": 5000,
      "text": "Thanks! I'll reach out then.",
      "speaker": "candidate",
      "translated_texts": {
        "es": "¡Gracias! Me pondré en contacto entonces.",
        "de": "Danke! Ich werde mich dann melden."
      }
    }
  ],
  "words": []
}
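
In the example response, each task reports its own `status` under `speech_understanding.response`. A small sketch of collecting those statuses follows. The function name `task_statuses` is hypothetical, the sample is a trimmed copy of the response above, and `"completed"` is the only status value this page shows, so any other value is treated here simply as "not completed".

```python
def task_statuses(result: dict) -> dict[str, str]:
    """Map each task name to the status reported in the response."""
    responses = result.get("speech_understanding", {}).get("response", {})
    # "unknown" is a local placeholder for a missing status, not an API value.
    return {task: info.get("status", "unknown") for task, info in responses.items()}


# Trimmed copy of the example response above.
sample = {
    "speech_understanding": {
        "response": {
            "translation": {"status": "completed"},
            "speaker_identification": {
                "mapping": {"A": "Michel Martin", "B": "Peter DeCarlo"},
                "status": "completed",
            },
            "custom_formatting": {"status": "completed"},
        }
    }
}

statuses = task_statuses(sample)
not_completed = [task for task, status in statuses.items() if status != "completed"]
print(statuses)
print(not_completed)
```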

Authorizations

Authorization
string
header
required

Body

application/json

Request body for speech understanding tasks.

transcript_id
string
required

The ID of the transcript to process.

speech_understanding
object
required

The speech understanding task to perform. Supports Translation, Speaker Identification, and Custom Formatting; the options for each task are set under the request object.
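
As an illustration of the body shape, the sketch below assembles `transcript_id` and `speech_understanding.request` from per-task options. The task keys (`translation`, `speaker_identification`, `custom_formatting`) are taken from the example on this page; the function `build_body` and its client-side check for unknown task names are assumptions for illustration, not documented API behavior. Whether several tasks may be combined in one call is suggested by the example response but not stated here.

```python
from typing import Any

# Task keys as they appear in the example on this page.
KNOWN_TASKS = {"translation", "speaker_identification", "custom_formatting"}


def build_body(transcript_id: str, **tasks: dict[str, Any]) -> dict[str, Any]:
    """Assemble a request body from keyword arguments named after tasks."""
    unknown = set(tasks) - KNOWN_TASKS
    if unknown:
        # Local convenience check only; the server's validation may differ.
        raise ValueError(f"unknown task(s): {sorted(unknown)}")
    return {
        "transcript_id": transcript_id,
        "speech_understanding": {"request": dict(tasks)},
    }


body = build_body(
    "12345",
    custom_formatting={
        "date": "mm/dd/yyyy",
        "phone_number": "(xxx)xxx-xxxx",
        "email": "username@domain.com",
    },
)
```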

Response

Successful response containing the speech understanding results.

speech_understanding
object

translated_texts
object

Translated text keyed by language code, e.g. {"es": "Texto traducido"}.

utterances
object[]

Array of utterances, each carrying its own translated_texts object (returned when match_original_utterance is true).
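
When per-utterance translations are present, they can be read alongside the speaker label. The sketch below assumes the utterance shape shown in the example response on this page; `translated_lines` is a hypothetical helper, and the sample is a trimmed copy of that example.

```python
def translated_lines(result: dict, language: str) -> list[tuple[str, str]]:
    """Return (speaker, translated text) pairs for one language code."""
    lines = []
    for utterance in result.get("utterances", []):
        translations = utterance.get("translated_texts", {})
        # Skip utterances that carry no translation for this language.
        if language in translations:
            lines.append((utterance["speaker"], translations[language]))
    return lines


# Trimmed copy of the second utterance from the example response.
sample = {
    "utterances": [
        {
            "confidence": 0.95,
            "start": 2500,
            "end": 5000,
            "text": "Thanks! I'll reach out then.",
            "speaker": "candidate",
            "translated_texts": {
                "es": "¡Gracias! Me pondré en contacto entonces.",
                "de": "Danke! Ich werde mich dann melden.",
            },
        }
    ]
}

for speaker, text in translated_lines(sample, "de"):
    print(f"{speaker}: {text}")
```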

words
object[]