Ruby Audio Intelligence

Auto Chapters

Enable Auto Chapters by setting auto_chapters to true in the transcription config. Auto Chapters requires punctuate to be enabled, which it is by default.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  auto_chapters: true
)

# Print one line per chapter: start-end timestamps followed by the headline.
transcript.chapters.each do |chapter|
  printf(
    "%<start>d-%<end>d: %<headline>s\n",
    start: chapter.start,
    end: chapter.end_,
    headline: chapter.headline
  )
end

Example output

250-28840: Smoke from hundreds of wildfires in Canada is triggering air quality alerts across US
29610-280340: High particulate matter in wildfire smoke can lead to serious health problems

Auto Chapters Using LeMUR

Check out the Creating Chapter Summaries cookbook for an example of how to use LeMUR’s custom text input parameter for chapter summaries.

For the full API reference, see the API reference section on the Auto Chapters page.
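The chapter timestamps printed above are millisecond offsets, which are hard to read at a glance. The following is a minimal sketch of converting them to MM:SS for display. The Chapter struct is a stand-in for the SDK's chapter objects, and format_ms and chapter_lines are hypothetical helper names, not part of the SDK.

```ruby
# Stand-in for the SDK's chapter objects; only the fields used in the
# Auto Chapters example are modeled here.
Chapter = Struct.new(:start, :end_, :headline, keyword_init: true)

# Convert a millisecond offset to a zero-padded MM:SS string.
def format_ms(milliseconds)
  total_seconds = milliseconds / 1000
  format('%02d:%02d', total_seconds / 60, total_seconds % 60)
end

# Build one display line per chapter.
def chapter_lines(chapters)
  chapters.map do |chapter|
    "#{format_ms(chapter.start)}-#{format_ms(chapter.end_)}: #{chapter.headline}"
  end
end

# Values copied from the example output.
chapters = [
  Chapter.new(
    start: 250,
    end_: 28_840,
    headline: 'Smoke from hundreds of wildfires in Canada is triggering air quality alerts across US'
  ),
  Chapter.new(
    start: 29_610,
    end_: 280_340,
    headline: 'High particulate matter in wildfire smoke can lead to serious health problems'
  )
]

puts chapter_lines(chapters)
```

With a real transcript, you could pass transcript.chapters to chapter_lines instead of the sample array.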

Content Moderation

The Content Moderation model lets you detect inappropriate content in audio files to ensure that your content is safe for all audiences.

The model pinpoints sensitive discussions in spoken data and their severity.

Quickstart

Enable Content Moderation by setting content_safety to true in the transcription config.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  content_safety: true
)

# Get the parts of the transcript that were flagged as sensitive.
transcript.content_safety_labels.results.each do |result|
  puts result.text
  printf("Timestamp: %<start>d - %<end>d\n", start: result.timestamp.start, end: result.timestamp.end_)

  # Each label has a confidence score and a severity score.
  result.labels.each do |label|
    printf(
      "%<label>s - %<confidence>.4f - %<severity>.4f\n",
      label: label.label,
      confidence: label.confidence,
      severity: label.severity
    )
  end
  puts
end

# Get the confidence of each label across the entire audio file.
transcript.content_safety_labels.summary.each_pair do |label, confidence|
  printf(
    "%<confidence>.2f%% confident that the audio contains %<label>s\n",
    confidence: confidence * 100,
    label: label
  )
end

puts

# Get the overall severity breakdown of each label across the entire audio file.
transcript.content_safety_labels.severity_score_summary.each_pair do |label, severity_confidence|
  printf(
    "%<confidence>.2f%% confident that the audio contains low-severity %<label>s\n",
    confidence: severity_confidence.low * 100,
    label: label
  )
  printf(
    "%<confidence>.2f%% confident that the audio contains medium-severity %<label>s\n",
    confidence: severity_confidence.medium * 100,
    label: label
  )
  printf(
    "%<confidence>.2f%% confident that the audio contains high-severity %<label>s\n",
    confidence: severity_confidence.high * 100,
    label: label
  )
end

Example output

Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US. Skylines...
Timestamp: 250 - 28920
disasters - 0.8141 - 0.4014

So what is it about the conditions right now that have caused this round of wildfires to...
Timestamp: 29290 - 56190
disasters - 0.9217 - 0.5665

So what is it in this haze that makes it harmful? And I'm assuming it is...
Timestamp: 56340 - 88034
health_issues - 0.9358 - 0.8906

...

99.42% confident that the audio contains disasters
92.70% confident that the audio contains health_issues

57.43% confident that the audio contains low-severity disasters
42.56% confident that the audio contains medium-severity disasters
0.00% confident that the audio contains high-severity disasters
23.57% confident that the audio contains low-severity health_issues
30.22% confident that the audio contains medium-severity health_issues
46.19% confident that the audio contains high-severity health_issues
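The severity score summary reports three scores per label, so a common follow-up is to pick the most likely severity band for each one. The sketch below shows one way to do that on plain data. SeverityScores is a stand-in for the SDK's severity summary objects, and dominant_severity is a hypothetical helper name.

```ruby
# Stand-in for the per-label severity summary; models only low/medium/high.
SeverityScores = Struct.new(:low, :medium, :high, keyword_init: true)

# Return the name of the band (:low, :medium, or :high) with the highest score.
def dominant_severity(scores)
  scores.to_h.max_by { |_band, score| score }.first
end

# Values copied from the example output, expressed as fractions.
severity_score_summary = {
  'disasters' => SeverityScores.new(low: 0.5743, medium: 0.4256, high: 0.0),
  'health_issues' => SeverityScores.new(low: 0.2357, medium: 0.3022, high: 0.4619)
}

severity_score_summary.each_pair do |label, scores|
  puts "#{label}: most likely #{dominant_severity(scores)} severity"
end
```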

Adjust the confidence threshold

The confidence threshold sets the minimum confidence score a label needs before it is flagged as inappropriate content. With the default threshold of 50%, any label with a confidence score of 50% or greater is flagged.

To adjust the confidence threshold for your transcription, include content_safety_confidence in the transcription config.

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  content_safety: true,
  content_safety_confidence: 60
)
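The API applies the threshold for you, but the rule itself is simple: a label is kept when its confidence, expressed as a percentage, is at or above the threshold. Here is a minimal sketch of that comparison on plain data, which you could also use to apply a stricter cutoff to results you already have. LabelScore and flagged_labels are hypothetical names used only for illustration.

```ruby
# Stand-in for a content safety label; models only the fields needed here.
LabelScore = Struct.new(:label, :confidence, keyword_init: true)

# Keep the labels whose confidence (0.0 to 1.0) meets the threshold,
# which is given as a percentage to mirror content_safety_confidence.
def flagged_labels(labels, threshold_percent: 50)
  labels.select { |label| label.confidence * 100 >= threshold_percent }
end

# Confidence values copied from the example output.
labels = [
  LabelScore.new(label: 'disasters', confidence: 0.8141),
  LabelScore.new(label: 'disasters', confidence: 0.9217),
  LabelScore.new(label: 'health_issues', confidence: 0.9358)
]

puts flagged_labels(labels, threshold_percent: 60).length
puts flagged_labels(labels, threshold_percent: 90).length
```

Raising the threshold from 60 to 90 drops the first label, because its confidence of 0.8141 falls below 90%.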

For the full API reference, as well as the supported labels and FAQs, refer to the full Content Moderation page.

Entity Detection

The Entity Detection model lets you automatically identify and categorize key information in transcribed audio content.

Here are a few examples of what you can detect:

  • Names of people
  • Organizations
  • Addresses
  • Phone numbers
  • Medical data
  • Social security numbers

For the full list of entities that you can detect, see Supported entities.

Supported languages

Entity Detection is available in multiple languages. See Supported languages.

Quickstart

Enable Entity Detection by setting entity_detection to true in the transcription config.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  entity_detection: true
)

# Print each detected entity with its type and timestamps.
transcript.entities.each do |entity|
  puts entity.text
  puts entity.entity_type
  printf("Timestamp: %<start>d - %<end>d\n\n", start: entity.start, end: entity.end_)
end

Example output

Canada
location
Timestamp: 2548 - 3130

the US
location
Timestamp: 5498 - 6350

...
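Once you have the detected entities, you may want them grouped by type rather than in transcript order. The following sketch groups entity text by entity type using plain Ruby. The Entity struct is a stand-in for the SDK's entity objects, and entity_texts_by_type is a hypothetical helper name.

```ruby
# Stand-in for the SDK's entity objects, with the fields used in the example above.
Entity = Struct.new(:entity_type, :text, :start, :end_, keyword_init: true)

# Map each entity type to the unique texts detected for it,
# in order of first appearance.
def entity_texts_by_type(entities)
  entities.group_by(&:entity_type).transform_values { |group| group.map(&:text).uniq }
end

# Values copied from the example output.
entities = [
  Entity.new(entity_type: 'location', text: 'Canada', start: 2548, end_: 3130),
  Entity.new(entity_type: 'location', text: 'the US', start: 5498, end_: 6350)
]

entity_texts_by_type(entities).each_pair do |type, texts|
  puts "#{type}: #{texts.join(', ')}"
end
```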

For the full API reference, as well as the supported entities and FAQs, refer to the full Entity Detection page.

Key Phrases

The Key Phrases model identifies significant words and phrases in your transcript and lets you extract the most important concepts or highlights from your audio or video file.

Quickstart

Enable Key Phrases by setting auto_highlights to true in the transcription config.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  auto_highlights: true
)

transcript.auto_highlights_result.results.each do |result|
  # Format every occurrence of the highlight as a timestamp pair.
  timestamps = (result.timestamps.map do |timestamp|
    format(
      '[Timestamp(start=%<start>s, end=%<end>s)]',
      start: timestamp.start,
      end: timestamp.end_
    )
  end).join(', ')
  printf(
    "Highlight: %<text>s, Count: %<count>d, Rank: %<rank>.2f, Timestamps: %<timestamp>s\n",
    text: result.text,
    count: result.count,
    rank: result.rank,
    timestamp: timestamps
  )
end

Example output

Highlight: air quality alerts, Count: 1, Rank: 0.08, Timestamps: [Timestamp(start=3978, end=5114)]
Highlight: wide ranging air quality consequences, Count: 1, Rank: 0.08, Timestamps: [Timestamp(start=235388, end=238838)]
Highlight: more fires, Count: 1, Rank: 0.07, Timestamps: [Timestamp(start=184716, end=185186)]
...
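If you only need the strongest highlights, you can select them by rank on the client side. This is a small sketch using plain Ruby. Highlight is a stand-in for the SDK's result objects, and top_highlights is a hypothetical helper name.

```ruby
# Stand-in for a key phrase result; timestamps are omitted for brevity.
Highlight = Struct.new(:text, :count, :rank, keyword_init: true)

# Return up to `limit` highlights with the highest rank, highest first.
# The relative order of highlights with equal rank is not guaranteed.
def top_highlights(results, limit)
  results.max_by(limit, &:rank)
end

# Values copied from the example output.
results = [
  Highlight.new(text: 'air quality alerts', count: 1, rank: 0.08),
  Highlight.new(text: 'wide ranging air quality consequences', count: 1, rank: 0.08),
  Highlight.new(text: 'more fires', count: 1, rank: 0.07)
]

top_highlights(results, 2).each { |highlight| puts highlight.text }
```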

For the full API reference and FAQs, refer to the full Key Phrases page.

PII Redaction

The PII Redaction model lets you minimize sensitive information about individuals by automatically identifying and removing it from your transcript.

Personally Identifiable Information (PII) is any information that can be used to identify a person, such as a name, email address, or phone number.

When you enable the PII Redaction model, your transcript will look like this:

  • With hash substitution: Hi, my name is ####!
  • With entity_name substitution: Hi, my name is [PERSON_NAME]!
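To make the difference between the two substitution styles concrete, here is an illustrative sketch that mimics them on a known string. It does not detect PII, which is the model's job. It only reproduces the output formats shown above, assuming hash substitution emits one # per redacted character. The substitute helper and the name John are made up for this example.

```ruby
# Replace every occurrence of `value` in `text` using the given style.
# :hash        -> one '#' per character (assumed from the example output)
# :entity_name -> the entity name in square brackets
def substitute(text, value, entity_name, style)
  replacement =
    case style
    when :hash then '#' * value.length
    when :entity_name then "[#{entity_name}]"
    else raise ArgumentError, "unknown style: #{style}"
    end
  text.gsub(value, replacement)
end

sentence = 'Hi, my name is John!' # "John" is a made-up name

puts substitute(sentence, 'John', 'PERSON_NAME', :hash)
puts substitute(sentence, 'John', 'PERSON_NAME', :entity_name)
```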

You can also create redacted audio files, which replace sensitive information with a beeping sound.

Supported languages

PII Redaction is available in multiple languages. See Supported languages.

Redacted properties

PII Redaction only redacts words in the text property. Properties from other features may still include PII, such as entities from Entity Detection or summary from Summarization.

Quickstart

Enable PII Redaction by setting redact_pii to true in the transcription config.

Use redact_pii_policies to specify the information you want to redact. For the full list of policies, see PII policies.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  redact_pii: true,
  redact_pii_policies: [
    AssemblyAI::Transcripts::PiiPolicy::PERSON_NAME,
    AssemblyAI::Transcripts::PiiPolicy::ORGANIZATION,
    AssemblyAI::Transcripts::PiiPolicy::OCCUPATION
  ]
)

puts transcript.text

Example output

Smoke from hundreds of wildfires in Canada is triggering air quality alerts
throughout the US. Skylines from Maine to Maryland to Minnesota are gray and
smoggy. And in some places, the air quality warnings include the warning to stay
inside. We wanted to better understand what's happening here and why, so we
called ##### #######, an ######### ######### in the ########## ## #############
###### ### ########### at ##### ####### ##########. Good morning, #########.
Good morning. So what is it about the conditions right now that have caused this
round of wildfires to affect so many people so far away? Well, there's a couple
of things. The season has been pretty dry already, and then the fact that we're
getting hit in the US. Is because there's a couple of weather systems that ...

Create redacted audio files

In addition to redacting sensitive information from the transcription text, you can also generate a version of the original audio file with the PII “beeped” out.

To create a redacted version of the audio file, set redact_pii_audio to true in the transcription config. Use redact_pii_audio_quality to specify the quality of the redacted audio file.

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  redact_pii: true,
  redact_pii_policies: [
    AssemblyAI::Transcripts::PiiPolicy::PERSON_NAME,
    AssemblyAI::Transcripts::PiiPolicy::ORGANIZATION,
    AssemblyAI::Transcripts::PiiPolicy::OCCUPATION
  ],
  redact_pii_audio: true,
  # Optional. Defaults to MP3.
  redact_pii_audio_quality: AssemblyAI::Transcripts::RedactPiiAudioQuality::WAV
)

# Fetch the redaction status and the URL of the redacted audio file.
redaction_result = client.transcripts.get_redacted_audio(transcript_id: transcript.id)
printf(
  "Status: %<status>s, Redacted audio URL: %<url>s\n",
  status: redaction_result.status,
  url: redaction_result.redacted_audio_url
)

Supported languages

You can only create redacted audio files for transcriptions in English and Spanish.

Maximum audio file size

You can only create redacted versions of audio files if the original file is smaller than 1 GB.

Example output

https://s3.us-west-2.amazonaws.com/api.assembly.ai.usw2/redacted-audio/ac06721c-d1ea-41a7-95f7-a9463421e6b1.mp3?AWSAccessKeyId=...
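Because redacted audio is limited to English and Spanish transcriptions and to files smaller than 1 GB, it can be useful to check both limits before enabling redact_pii_audio. The sketch below is one way to express that check. It makes two assumptions that you should verify: that 1 GB means 1,000,000,000 bytes, and that language codes start with en or es (for example en_us). The constant and method names are hypothetical.

```ruby
# Assumption: "1 GB" is read as 10^9 bytes. Adjust if the limit is binary.
MAX_REDACTABLE_BYTES = 1_000_000_000

# Assumption: English and Spanish language codes begin with these prefixes.
REDACTABLE_LANGUAGE_PREFIXES = %w[en es].freeze

# Return true when a file is within the documented redacted audio limits.
def redacted_audio_supported?(file_size_bytes:, language_code:)
  prefix = language_code.to_s.downcase.split(/[_-]/).first
  file_size_bytes < MAX_REDACTABLE_BYTES && REDACTABLE_LANGUAGE_PREFIXES.include?(prefix)
end

puts redacted_audio_supported?(file_size_bytes: 25_000_000, language_code: 'en_us')
puts redacted_audio_supported?(file_size_bytes: 25_000_000, language_code: 'fr')
puts redacted_audio_supported?(file_size_bytes: 2_000_000_000, language_code: 'es')
```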

For the full API reference, as well as the supported policies and FAQs, refer to the full PII Redaction page.

Sentiment Analysis

The Sentiment Analysis model detects the sentiment of each spoken sentence in the transcript text. Use Sentiment Analysis to get a detailed analysis of the positive, negative, or neutral sentiment conveyed in the audio, along with a confidence score for each result.

Quickstart

Enable Sentiment Analysis by setting sentiment_analysis to true in the transcription config.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  sentiment_analysis: true
)

# Print each sentence with its sentiment, confidence score, and timestamps.
transcript.sentiment_analysis_results.each do |result|
  puts result.text
  puts result.sentiment
  puts result.confidence
  printf("Timestamp: %<start>d - %<end>d\n", start: result.start, end: result.end_)
end

Example output

Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US.
NEGATIVE
0.8181032538414001
Timestamp: 250 - 6350
...

Sentiment Analysis Using LeMUR

Check out the LeMUR for Customer Call Sentiment Analysis cookbook for an example of how to use LeMUR’s QA feature for sentiment analysis.

Add speaker labels to sentiments

To add a speaker label to each sentiment analysis result, enable Speaker Diarization by setting speaker_labels to true in the transcription config.

Each sentiment result will then have a speaker field that contains the speaker label.

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  sentiment_analysis: true,
  speaker_labels: true
)

# ...

transcript.sentiment_analysis_results.each do |result|
  puts result.speaker
end
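With speaker labels attached, you can summarize how each speaker's sentences were classified. The following sketch counts sentiments per speaker on plain data. SentimentResult is a stand-in for the SDK's result objects, sentiment_counts_by_speaker is a hypothetical helper name, and the sample rows (speakers, sentences, and sentiment strings) are made up for illustration.

```ruby
# Stand-in for a sentiment analysis result with a speaker label.
SentimentResult = Struct.new(:text, :sentiment, :confidence, :speaker, keyword_init: true)

# Map each speaker to a count of how often each sentiment occurred.
def sentiment_counts_by_speaker(results)
  results.group_by(&:speaker).transform_values { |group| group.map(&:sentiment).tally }
end

# Made-up sample rows; real values come from transcript.sentiment_analysis_results.
results = [
  SentimentResult.new(text: 'First sentence.', sentiment: 'NEGATIVE', confidence: 0.82, speaker: 'A'),
  SentimentResult.new(text: 'Second sentence.', sentiment: 'NEUTRAL', confidence: 0.64, speaker: 'B'),
  SentimentResult.new(text: 'Third sentence.', sentiment: 'NEGATIVE', confidence: 0.71, speaker: 'A')
]

sentiment_counts_by_speaker(results).each_pair do |speaker, counts|
  puts "Speaker #{speaker}: #{counts.map { |sentiment, count| "#{sentiment}=#{count}" }.join(', ')}"
end
```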

For the full API reference and FAQs, refer to the full Sentiment Analysis page.

Summarization

Distill important information by summarizing your audio files.

The Summarization model generates a summary of the resulting transcript. You can control the style and format of the summary using Summary models and Summary types.

Summarization and Auto Chapters

You can enable either Summarization or Auto Chapters in a single transcription, but not both.

Quickstart

Enable Summarization by setting summarization to true in the transcription config. Use summary_model and summary_type to change the summary format.

If you specify either summary_model or summary_type, you must also specify the other.
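The two constraints described above (Summarization and Auto Chapters are mutually exclusive, and summary_model and summary_type must be given together) can be checked before you send a request. This is a minimal sketch of such a check on a plain options hash. The summarization_config_errors helper is hypothetical and is not part of the SDK, and the string values stand in for the SDK constants.

```ruby
# Return a list of human-readable problems with a transcription config hash.
# An empty list means the summarization-related options are consistent.
def summarization_config_errors(config)
  errors = []
  if config[:summarization] && config[:auto_chapters]
    errors << 'summarization and auto_chapters cannot both be enabled'
  end
  if config.key?(:summary_model) != config.key?(:summary_type)
    errors << 'summary_model and summary_type must be specified together'
  end
  errors
end

valid = { summarization: true, summary_model: 'informative', summary_type: 'bullets' }
invalid = { summarization: true, auto_chapters: true, summary_model: 'informative' }

puts summarization_config_errors(valid).empty?
puts summarization_config_errors(invalid)
```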

The following example returns an informative summary in a bulleted list.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  summarization: true,
  summary_model: AssemblyAI::Transcripts::SummaryModel::INFORMATIVE,
  summary_type: AssemblyAI::Transcripts::SummaryType::BULLETS
)

puts transcript.summary

Example output

- Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US. Skylines from Maine to Maryland to Minnesota are gray and smoggy. In some places, the air quality warnings include the warning to stay inside.
- Air pollution levels in Baltimore are considered unhealthy. Exposure to high levels can lead to a host of health problems. With climate change, we are seeing more wildfires. Will we be seeing more of these kinds of wide ranging air quality consequences?

Custom Summaries Using LeMUR

If you want more control of the output format, see how to generate a Custom summary using LeMUR.

For the full API reference, as well as the supported summary models/types and FAQs, refer to the full Summarization page.

Topic Detection

The Topic Detection model lets you identify different topics in the transcript. The model uses the IAB Content Taxonomy, a standardized language for content description which consists of 698 comprehensive topics.

Quickstart

Enable Topic Detection by setting iab_categories to true in the transcription config.

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: '<YOUR_API_KEY>')

# For local files see our Getting Started guides.
audio_url = 'https://assembly.ai/wildfires.mp3'

transcript = client.transcripts.transcribe(
  audio_url: audio_url,
  iab_categories: true
)

# Get the parts of the transcript that were tagged with topics
transcript.iab_categories_result.results.each do |result|
  puts result.text
  printf("Timestamp: %<start>d - %<end>d\n", start: result.timestamp.start, end: result.timestamp.end_)
  result.labels.each do |label|
    printf("%<label>s (%<relevance>.4f)\n", label: label.label, relevance: label.relevance)
  end
  puts
end

puts

# Get a summary of all topics in the transcript
transcript.iab_categories_result.summary.each_pair do |topic, relevance|
  printf(
    "Audio is %<relevance>.2f%% relevant to %<topic>s\n",
    relevance: relevance * 100,
    topic: topic
  )
end

Example output

Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US. Skylines...
Timestamp: 250 - 28920
Home&Garden>IndoorEnvironmentalQuality (0.9881)
NewsAndPolitics>Weather (0.5561)
MedicalHealth>DiseasesAndConditions>LungAndRespiratoryHealth (0.0042)
...
Audio is 100.00% relevant to NewsAndPolitics>Weather
Audio is 93.78% relevant to Home&Garden>IndoorEnvironmentalQuality
...

Topic Detection Using LeMUR

Check out the Custom Topic Tags cookbook for an example of how to use LeMUR for custom topic detection.

For the full API reference, as well as the full list of supported topics and FAQs, refer to the full Topic Detection page.
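Topic labels in the Topic Detection output are hierarchical paths joined with >, and the summary maps each path to a relevance score. The sketch below picks the most relevant topics from such a summary and splits each path into its levels. It works on a plain hash with values taken from the example output, and top_topics is a hypothetical helper name.

```ruby
# Return up to `limit` topics with the highest relevance, highest first.
# Each entry is [path_levels, relevance], where path_levels is the topic
# label split on '>' into its hierarchy levels.
def top_topics(summary, limit)
  summary
    .max_by(limit) { |_topic, relevance| relevance }
    .map { |topic, relevance| [topic.split('>'), relevance] }
end

# Values copied from the example output, expressed as fractions.
summary = {
  'Home&Garden>IndoorEnvironmentalQuality' => 0.9378,
  'NewsAndPolitics>Weather' => 1.0
}

top_topics(summary, 2).each do |levels, relevance|
  puts "#{levels.last} (under #{levels.first}): #{format('%.2f', relevance * 100)}%"
end
```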