December 15, 2023

🚀 New Punctuation & Casing Model For Real-Time Transcription

We are excited to announce the release of our latest Punctuation and Truecasing model!

Smitha Kolan

Developer Educator

🚀 New Punctuation & Casing Model For Real-Time

We recently released a significant improvement to our Punctuation and Truecasing model for asynchronous transcription.

This week we updated our real-time transcription service with the new punctuation and truecasing model, which provides the following improvements:

Question marks are properly attributed for real-time streaming.
Significant improvements in casing for challenging word types, such as mixed-case words (+39% F1 score), acronyms (+20% F1 score), and capital-case words (+11% F1 score).
A 17% relative improvement, averaged across our test datasets, in upper-case letter classification.
Overall punctuation accuracy up by 11% (F1 score).
In our qualitative analysis, human evaluators preferred the new model over the previous one 61% of the time on average.
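
For readers less familiar with the metrics above, here is a minimal sketch of how an F1 score and a relative improvement are typically computed. The counts are hypothetical and chosen only for illustration; they are not our evaluation data, and the helper names `f1Score` and `relativeImprovement` are made up for this example.

```javascript
// F1 is the harmonic mean of precision and recall.
// All counts below are hypothetical, for illustration only.
function f1Score({ truePositives, falsePositives, falseNegatives }) {
  const predicted = truePositives + falsePositives
  const actual = truePositives + falseNegatives
  if (predicted === 0 || actual === 0) return 0
  const precision = truePositives / predicted
  const recall = truePositives / actual
  if (precision + recall === 0) return 0
  return (2 * precision * recall) / (precision + recall)
}

// Relative improvement: the change expressed as a fraction of the old score.
function relativeImprovement(oldScore, newScore) {
  return (newScore - oldScore) / oldScore
}

// Example: acronym casing decisions for an old and a new model (made-up numbers).
const oldF1 = f1Score({ truePositives: 60, falsePositives: 20, falseNegatives: 40 })
const newF1 = f1Score({ truePositives: 70, falsePositives: 15, falseNegatives: 30 })

console.log(`old F1: ${oldF1.toFixed(3)}`) // old F1: 0.667
console.log(`new F1: ${newF1.toFixed(3)}`) // new F1: 0.757
console.log(`relative improvement: ${(relativeImprovement(oldF1, newF1) * 100).toFixed(1)}%`) // 13.5%
```

A relative figure depends on the baseline: the same absolute gain in F1 is a larger relative improvement when the previous score was low.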

Try it out now in our new real-time playground.

AssemblyAI Node/JavaScript SDK v4 Released

Version 4 of our JavaScript SDK is live! The new version works not only on Node.js but also in client-side JavaScript runtime environments, including web browsers, Bun, Cloudflare Workers, and more.

We've revamped our sample applications so that integrations that were previously incompatible can now use the new SDK version.

    import { AssemblyAI } from 'assemblyai'

    const client = new AssemblyAI({ apiKey: 'YOUR_API_KEY' })

    const audioUrl = 'https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3'

    const run = async () => {
      const transcript = await client.transcripts.transcribe({ audio: audioUrl })
      console.log(transcript.text)
    }

    run()
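
The snippet above logs `transcript.text` directly. In practice you may want to check whether the transcription succeeded first. The following is a minimal sketch of such a guard; `getTranscriptText` is a hypothetical helper, not part of the SDK, and it assumes the returned transcript object exposes `status`, `error`, and `text` fields. Plain objects stand in for API responses so the example runs without an API key.

```javascript
// Hypothetical helper, not part of the SDK.
// Assumes the transcript object has `status`, `error`, and `text` fields.
function getTranscriptText(transcript) {
  if (transcript.status === 'error') {
    throw new Error(`Transcription failed: ${transcript.error}`)
  }
  // Fall back to an empty string if no text is present.
  return transcript.text ?? ''
}

// Plain objects standing in for API responses:
console.log(getTranscriptText({ status: 'completed', text: 'Hello, world.' }))

try {
  getTranscriptText({ status: 'error', error: 'Audio file could not be downloaded' })
} catch (err) {
  console.error(err.message)
}
```

Inside `run`, the call `console.log(transcript.text)` could then become `console.log(getTranscriptText(transcript))`.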

Fresh From Our Blog

AI for Universal Audio Understanding: Qwen-Audio Explained: Researchers have recently made progress towards universal audio understanding, a step towards foundational audio models. The approach is based on joint audio-language pre-training that improves performance without task-specific fine-tuning.

How to Create SRT Files for Videos in Python: Learn how to create SRT subtitle files for videos using Python in this easy-to-follow guide.

Key phrase detection in audio files using Python: Learn how to identify key phrases and important words using Python and AssemblyAI.

Our Trending YouTube Tutorials

Convert Speech to Text In Java (Basic Tutorial): Learn how to transcribe an audio file in Java using AssemblyAI's speech-to-text Java SDK.

Build AI App Prototypes Visually with No-Code (Open-source): Learn how to use Rivet to build a no-code AI app that transcribes a podcast episode, takes your question, and generates an answer using LeMUR.

Run LLMs locally - 5 Must-Know Frameworks!: Learn how to run LLMs locally with Ollama, GPT4All, PrivateGPT, llama.cpp, and LangChain.
