Audio Intelligence
Auto Chapters
The Auto Chapters model summarizes audio data over time into chapters. Chapters make it easy for users to navigate and find specific information.
Each chapter contains the following:
- Summary
- One-line gist
- Headline
- Start and end timestamps
Auto Chapters and Summarization
You can enable either Auto Chapters or Summarization in a single transcription, but not both.
Quickstart
Enable Auto Chapters by setting `auto_chapters` to `true` in the transcription config. `punctuate` must be enabled to use Auto Chapters (`punctuate` is enabled by default).
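For example, a minimal sketch with the AssemblyAI Python SDK (the API key and audio URL below are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

# Enable Auto Chapters in the transcription config.
config = aai.TranscriptionConfig(auto_chapters=True)

transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# Each chapter has a summary, a one-line gist, a headline,
# and start/end timestamps in milliseconds.
for chapter in transcript.chapters:
    print(f"{chapter.start}-{chapter.end}: {chapter.headline}")
    print(chapter.gist)
    print(chapter.summary)
```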
Auto Chapters Using LeMUR
Check out this cookbook Creating Chapter Summaries for an example of how to leverage LeMUR’s custom text input parameter for chapter summaries.
For the full API reference, see the API reference section on the Auto Chapters page.
Content Moderation
The Content Moderation model lets you detect inappropriate content in audio files to ensure that your content is safe for all audiences.
The model pinpoints sensitive discussions in spoken data and rates their severity.
Quickstart
Enable Content Moderation by setting `content_safety` to `true` in the transcription config.
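A minimal sketch with the Python SDK, assuming the `content_safety` response shape described in the API reference (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig(content_safety=True)
transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# Each result carries the flagged text, its timestamp, and one or more
# labels with a confidence score and a severity rating.
for result in transcript.content_safety.results:
    print(result.text)
    print(f"Timestamp: {result.timestamp.start}-{result.timestamp.end}")
    for label in result.labels:
        print(f"{label.label} (confidence {label.confidence}, severity {label.severity})")
```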
Adjust the confidence threshold
The confidence threshold determines the minimum confidence score required for content to be flagged as inappropriate. A threshold of 50% (the default) means that any label with a confidence score of 50% or higher is flagged.
To adjust the confidence threshold for your transcription, include `content_safety_confidence` in the transcription config.
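For example, to flag only labels with a confidence score of 60% or higher (a sketch; the threshold value is illustrative):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

# Raise the threshold so only labels with >= 60% confidence are flagged.
config = aai.TranscriptionConfig(
    content_safety=True,
    content_safety_confidence=60,
)
```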
For the full API reference, as well as the supported labels and FAQs, refer to the full Content Moderation page.
Entity Detection
The Entity Detection model lets you automatically identify and categorize key information in transcribed audio content.
Here are a few examples of what you can detect:
- Names of people
- Organizations
- Addresses
- Phone numbers
- Medical data
- Social security numbers
For the full list of entities that you can detect, see Supported entities.
Supported languages
Entity Detection is available in multiple languages. See Supported languages.
Quickstart
Enable Entity Detection by setting `entity_detection` to `true` in the transcription config.
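A minimal sketch with the Python SDK, assuming detected entities are returned on the transcript's `entities` property (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig(entity_detection=True)
transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# Each detected entity has the matched text, a category,
# and timestamps in milliseconds.
for entity in transcript.entities:
    print(f"{entity.entity_type}: {entity.text} ({entity.start}-{entity.end})")
```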
For the full API reference, as well as the supported entities and FAQs, refer to the full Entity Detection page.
Key Phrases
The Key Phrases model identifies significant words and phrases in your transcript and lets you extract the most important concepts or highlights from your audio or video file.
Quickstart
Enable Key Phrases by setting `auto_highlights` to `true` in the transcription config.
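A minimal sketch with the Python SDK, assuming the `auto_highlights` response shape described in the API reference (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig(auto_highlights=True)
transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# Each highlight includes the phrase, how often it occurs,
# and a rank indicating its relative importance.
for result in transcript.auto_highlights.results:
    print(f"{result.text} (count {result.count}, rank {result.rank})")
```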
For the full API reference and FAQs, refer to the full Key Phrases page.
PII Redaction
The PII Redaction model lets you minimize sensitive information about individuals by automatically identifying and removing it from your transcript.
Personally Identifiable Information (PII) is any information that can be used to identify a person, such as a name, email address, or phone number.
When you enable the PII Redaction model, your transcript will look like this:
- With `hash` substitution: Hi, my name is ####!
- With `entity_name` substitution: Hi, my name is [PERSON_NAME]!
You can also Create redacted audio files to replace sensitive information with a beeping sound.
Supported languages
PII Redaction is available in multiple languages. See Supported languages.
Redacted properties
PII Redaction only redacts words in the `text` property. Properties from other features may still include PII, such as `entities` from Entity Detection or `summary` from Summarization.
Quickstart
Enable PII Redaction on the `TranscriptionConfig` using the `set_redact_pii()` method. Set `policies` to specify the information you want to redact. For the full list of policies, see PII policies.
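For example, a sketch that redacts person names and phone numbers and substitutes entity names (the policies shown are an illustrative subset; API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig().set_redact_pii(
    policies=[
        aai.PIIRedactionPolicy.person_name,
        aai.PIIRedactionPolicy.phone_number,
    ],
    substitution=aai.PIISubstitutionPolicy.entity_name,
)

transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# Redacted entities appear as placeholders, e.g. [PERSON_NAME].
print(transcript.text)
```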
Create redacted audio files
In addition to redacting sensitive information from the transcription text, you can also generate a version of the original audio file with the PII “beeped” out.
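A sketch, assuming the SDK's `redact_audio` option on `set_redact_pii()` and its `get_redacted_audio_url()` helper (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig().set_redact_pii(
    policies=[aai.PIIRedactionPolicy.person_name],
    redact_audio=True,  # also produce a copy of the audio with PII beeped out
)

transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# Fetch a URL to the redacted audio file.
print(transcript.get_redacted_audio_url())
```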
Sentiment Analysis
The Sentiment Analysis model detects the sentiment of each spoken sentence in the transcript text.
Quickstart
Enable Sentiment Analysis by setting `sentiment_analysis` to `true` in the transcription config.
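A minimal sketch with the Python SDK (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig(sentiment_analysis=True)
transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# One result per spoken sentence: POSITIVE, NEUTRAL, or NEGATIVE,
# with a confidence score and timestamps in milliseconds.
for result in transcript.sentiment_analysis:
    print(f"{result.sentiment} ({result.confidence}): {result.text}")
```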
Sentiment Analysis Using LeMUR
Check out this cookbook LeMUR for Customer Call Sentiment Analysis for an example of how to leverage LeMUR’s QA feature for sentiment analysis.
Add speaker labels to sentiments
To add speaker labels to each sentiment analysis result using Speaker Diarization, enable `speaker_labels` in the transcription config. Each sentiment result will then have a `speaker` field that contains the speaker label.
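A sketch combining the two features (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig(
    sentiment_analysis=True,
    speaker_labels=True,  # adds a speaker field to each sentiment result
)
transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

for result in transcript.sentiment_analysis:
    print(f"Speaker {result.speaker}: {result.sentiment} - {result.text}")
```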
For the full API reference and FAQs, refer to the full Sentiment Analysis page.
Summarization
Distill important information by summarizing your audio files.
The Summarization model generates a summary of the resulting transcript. You can control the style and format of the summary using Summary models and Summary types.
Summarization and Auto Chapters
You can enable either Summarization or Auto Chapters in a single transcription, but not both.
Quickstart
Enable Summarization by setting `summarization` to `true` in the transcription config. Use `summary_model` and `summary_type` to change the summary format.
If you specify either `summary_model` or `summary_type`, you must also specify the other.
The following example returns an informative summary in a bulleted list.
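A sketch of that configuration with the Python SDK (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

# Request an informative summary formatted as a bulleted list.
config = aai.TranscriptionConfig(
    summarization=True,
    summary_model=aai.SummarizationModel.informative,
    summary_type=aai.SummarizationType.bullets,
)

transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)
print(transcript.summary)
```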
Custom Summaries Using LeMUR
If you want more control of the output format, see how to generate a Custom summary using LeMUR.
For the full API reference, as well as the supported summary models/types and FAQs, refer to the full Summarization page.
Topic Detection
The Topic Detection model lets you identify different topics in the transcript. The model uses the IAB Content Taxonomy, a standardized language for content description that consists of 698 comprehensive topics.
Quickstart
Enable Topic Detection by setting `iab_categories` to `true` in the transcription parameters.
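A minimal sketch with the Python SDK, assuming the `iab_categories` response shape described in the API reference (API key and audio URL are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

config = aai.TranscriptionConfig(iab_categories=True)
transcript = aai.Transcriber().transcribe(
    "https://example.org/audio.mp3",  # placeholder audio URL
    config=config,
)

# Topics detected for each portion of text, with a relevance score.
for result in transcript.iab_categories.results:
    print(result.text)
    for label in result.labels:
        print(f"  {label.label} (relevance {label.relevance})")

# Overall relevance of each topic across the whole audio file.
for topic, relevance in transcript.iab_categories.summary.items():
    print(f"{topic}: {relevance}")
```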
Topic Detection Using LeMUR
Check out this cookbook Custom Topic Tags for an example of how to leverage LeMUR for custom topic detection.
For the full API reference, as well as the full list of supported topics and FAQs, refer to the full Topic Detection page.