Summarization - AssemblyAI

Supported regions

US & EU

Generate summaries of your audio transcripts, split into chapters with timestamps.

Summarization is in open beta. Please send us feedback if the results are not as good as you expected.

Quickstart

Request a summary in the same call that submits an async transcription:

Python

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

audio_url = "https://assembly.ai/wildfires.mp3"

data = {
    "audio_url": audio_url,
    "speech_understanding": {
        "request": {
            "summarization": {
                "summary_type": "paragraph"
            }
        }
    }
}

response = requests.post(base_url + "/v2/transcript", json=data, headers=headers)
transcript_id = response.json()['id']
polling_endpoint = base_url + "/v2/transcript/" + transcript_id

while True:
    transcription_result = requests.get(polling_endpoint, headers=headers).json()
    if transcription_result['status'] == 'completed':
        break
    elif transcription_result['status'] == 'error':
        raise RuntimeError(f"Transcription failed: {transcription_result['error']}")
    else:
        time.sleep(3)

print(transcription_result["speech_understanding"]["response"]["summarization"])

JavaScript

const baseUrl = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
  "content-type": "application/json",
};

// Submit the audio for transcription with summarization enabled
const audioUrl = "https://assembly.ai/wildfires.mp3";

const data = {
  audio_url: audioUrl,
  speech_understanding: {
    request: {
      summarization: {
        summary_type: "paragraph"
      }
    }
  }
};

const response = await fetch(`${baseUrl}/v2/transcript`, {
  method: "POST",
  headers,
  body: JSON.stringify(data),
});

const { id: transcriptId } = await response.json();
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

let transcriptionResult;
while (true) {
  const pollingResponse = await fetch(pollingEndpoint, { headers });
  transcriptionResult = await pollingResponse.json();

  if (transcriptionResult.status === "completed") {
    break;
  } else if (transcriptionResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptionResult.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}

console.log(transcriptionResult.speech_understanding.response.summarization);

Example output

{
    "status": "success",
    "summary": [
      {
        "start": 240,
        "end": 37100,
        "text": "Smoke from hundreds of Canadian wildfires is causing air quality alerts across the US, turning skylines gray and prompting warnings to stay inside.",
        "headline": "Smoke from Canadian Wildfires Affects US Air Quality"
      },
      {
        "start": 39100,
        "end": 60670,
        "text": "Professor Peter DeCarlo explains that dry conditions and specific weather systems are channeling smoke from Canadian wildfires into the Mid-Atlantic and Northeast regions of the US.",
        "headline": "Weather Systems Channeling Smoke"
      },
      {
        "start": 62350,
        "end": 92190,
        "text": "The harmful component of the haze is particulate matter, microscopic particles smaller than a hair's width that can penetrate the lungs and impact the respiratory, cardiovascular, and neurological systems.",
        "headline": "Particulate Matter Health Risks"
      },
      {
        "start": 93550,
        "end": 135990,
        "text": "Current particulate matter concentrations are extremely high, reaching 150 micrograms per meter cubed, which is over 10 times the annual average and four times the 24-hour limit, posing significant health risks.",
        "headline": "Extreme Concentration Levels"
      },
      {
        "start": 137610,
        "end": 170140,
        "text": "Vulnerable groups include children, the elderly, and individuals with pre-existing respiratory or heart conditions, as their bodies are less able to cope with high levels of air pollution.",
        "headline": "Vulnerable Populations"
      },
      {
        "start": 171020,
        "end": 193260,
        "text": "While some areas like New York currently face higher concentrations, the smoke is expected to shift as weather patterns change, moving the highest levels away from the current regions over the next few days.",
        "headline": "Forecast for Smoke Dispersion"
      },
      {
        "start": 198280,
        "end": 239850,
        "text": "The duration of the smoke impact depends on weather system changes; as the current systems shift, the smoke will be pushed elsewhere, ending the current impact on the region.",
        "headline": "Duration of Impact"
      },
      {
        "start": 241370,
        "end": 280530,
        "text": "Climate change is predicted to lead to longer and more frequent wildfire seasons, resulting in more widespread air quality consequences, particularly affecting the eastern US more often in the future.",
        "headline": "Future Implications of Climate Change"
      }
    ],
    "summary_type": "paragraph",
    "effort": "low"
}
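
The chapter list in this response can be turned into a readable outline with a few lines of Python. The following is a minimal sketch rather than part of the API: format_timestamp and render_chapters are hypothetical helper names, and the code assumes that start and end are millisecond offsets into the audio.

```python
def format_timestamp(ms: int) -> str:
    """Convert a millisecond offset to MM:SS, or H:MM:SS for long audio."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def render_chapters(summarization: dict) -> str:
    """Render each chapter as a timestamped headline followed by its text."""
    lines = []
    for chapter in summarization.get("summary", []):
        start = format_timestamp(chapter["start"])
        end = format_timestamp(chapter["end"])
        lines.append(f"[{start} - {end}] {chapter['headline']}")
        lines.append(f"    {chapter['text']}")
    return "\n".join(lines)


# First two chapters of the example output above.
example = {
    "status": "success",
    "summary": [
        {
            "start": 240,
            "end": 37100,
            "text": "Smoke from hundreds of Canadian wildfires is causing air quality alerts across the US, turning skylines gray and prompting warnings to stay inside.",
            "headline": "Smoke from Canadian Wildfires Affects US Air Quality",
        },
        {
            "start": 39100,
            "end": 60670,
            "text": "Professor Peter DeCarlo explains that dry conditions and specific weather systems are channeling smoke from Canadian wildfires into the Mid-Atlantic and Northeast regions of the US.",
            "headline": "Weather Systems Channeling Smoke",
        },
    ],
}

print(render_chapters(example))
```

In the Quickstart, the same helper could be called on the summarization object that the final print statement outputs.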

Customize your summary

You can control the summary output with the summary_type and effort parameters.

Bullets vs Paragraphs

The summary_type parameter accepts two values: bullets and paragraph. bullets produces short, concise bullet-point chapter summaries, while paragraph generally produces longer, more detailed summaries.

Effort

The effort parameter determines whether to spend more processing power to get higher-quality summaries. There are currently two options: low, which is the default, and medium. Summarization is not a data-intensive task, and it rarely needs the utmost accuracy. For most use cases, leaving effort at the default of low is all you need.
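
Because both parameters accept only a small set of values, it can help to validate them before sending a request. The following is a minimal sketch under that assumption: build_summarization_request is a hypothetical helper, not an SDK function, and the allowed values are taken from the descriptions above.

```python
VALID_SUMMARY_TYPES = {"bullets", "paragraph"}
VALID_EFFORTS = {"low", "medium"}


def build_summarization_request(audio_url: str, summary_type: str, effort: str = "low") -> dict:
    """Build a /v2/transcript request body with summarization enabled.

    Hypothetical helper: rejects values this page does not list.
    """
    if summary_type not in VALID_SUMMARY_TYPES:
        raise ValueError(f"summary_type must be one of {sorted(VALID_SUMMARY_TYPES)}")
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {sorted(VALID_EFFORTS)}")
    return {
        "audio_url": audio_url,
        "speech_understanding": {
            "request": {
                "summarization": {
                    "summary_type": summary_type,
                    "effort": effort,
                }
            }
        },
    }


payload = build_summarization_request(
    "https://assembly.ai/wildfires.mp3", "bullets", effort="medium"
)
print(payload["speech_understanding"]["request"]["summarization"])
```

The returned dictionary can be passed as the json argument of requests.post, in the same way as the data dictionary in the Quickstart.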

Example

Python

import requests
import time

base_url = "https://api.assemblyai.com"

headers = {
    "authorization": "<YOUR_API_KEY>"
}

# Need to transcribe a local file? Learn more here: https://www.assemblyai.com/docs/getting-started/transcribe-an-audio-file

upload_url = "https://assembly.ai/wildfires.mp3"

# Configure the transcript with bullet-style summaries at medium effort

data = {
    "audio_url": upload_url,
    "language_detection": True,
    "speaker_labels": True,
    "speech_understanding": {
        "request": {
            "summarization": {
                "summary_type": "bullets",
                "effort": "medium"
            }
        }
    }
}

# Submit the transcription request

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
# Then poll GET /v2/transcript/<transcript_id> until the status is "completed", as in the Quickstart
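
The result-fetching step that this example leaves out follows the same polling pattern as the Quickstart. The following is a minimal sketch of that step: poll_for_summary is a hypothetical helper, and the timeout is an addition that the Quickstart loop does not have. The helper takes a fetch callable so that the HTTP call stays outside it and the polling logic can be exercised without network access.

```python
import time
from typing import Callable


def poll_for_summary(
    fetch: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the transcript completes, then return its summarization.

    fetch must return the parsed JSON body of GET /v2/transcript/<transcript_id>.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        status = result.get("status")
        if status == "completed":
            return result["speech_understanding"]["response"]["summarization"]
        if status == "error":
            raise RuntimeError(f"Transcription failed: {result.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout} seconds")
        sleep(interval)


# Usage with the variables from the example above (requires the requests package):
#
#   transcript_id = response.json()["id"]
#   polling_endpoint = base_url + "/v2/transcript/" + transcript_id
#   summary = poll_for_summary(
#       lambda: requests.get(polling_endpoint, headers=headers).json()
#   )
#   print(summary)
```

Any status other than completed or error is treated as still in progress, which matches the else branch of the Quickstart loop.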

When to use medium

Some use cases require that specific details are captured, or involve topics that are more challenging to summarize. In these cases, medium effort parses out important details more intelligently. Examples:

important meetings where missed details are significant
multilingual texts where it's important to capture all languages
very long audio, generally 1.5 hours and longer
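
The guidance above can be expressed as a small decision helper. The following is a rough sketch only: choose_effort, its parameters, and the hard 1.5-hour cutoff are illustrative assumptions derived from the list above, not rules enforced by the API.

```python
LONG_AUDIO_SECONDS = 1.5 * 60 * 60  # "generally 1.5h and longer"


def choose_effort(
    duration_seconds: float,
    multilingual: bool = False,
    high_stakes: bool = False,
) -> str:
    """Pick an effort level from the criteria listed above (hypothetical helper)."""
    if high_stakes or multilingual or duration_seconds >= LONG_AUDIO_SECONDS:
        return "medium"
    return "low"


print(choose_effort(10 * 60))                     # short, ordinary audio
print(choose_effort(2 * 60 * 60))                 # two-hour recording
print(choose_effort(10 * 60, multilingual=True))  # short but multilingual
```

The returned string can be used as the effort value in the summarization request.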
