Popular

Decoding Strategies: How LLMs Choose The Next Word

Large Language Models are trained to predict the next word. When generating text, however, it is the combination of their probability estimates with algorithms known as decoding strategies that determines which words they actually choose. Learn how decoding strategies work in this article.

AI trends in 2024: Graph Neural Networks

From fundamental research to productionized AI models, let’s discover how this cutting-edge technology is powering production applications and may be shaping the future of AI.

AI for Universal Audio Understanding: Qwen-Audio Explained

Recently, researchers have made progress towards universal audio understanding, marking an advancement towards foundational audio models. The approach is based on joint audio-language pre-training that enhances performance without task-specific fine-tuning.

Combining Speech Recognition and Diarization in one model

A new approach towards multi-speaker speech processing integrates Speaker Diarization and Automatic Speech Recognition in a unified framework. We discuss the key insights from this recent exciting development in Speech AI research.

How DALL-E 2 Actually Works

How does OpenAI's groundbreaking DALL-E 2 model actually work? Check out this detailed guide to learn the ins and outs of DALL-E 2.

What AI Music Generators Can Do (And How They Do It)

Text-to-Music Models are advancing rapidly with the recent release of new platforms for AI-generated music. This guide focuses on MusicLM, MusicGen, and Stable Audio, exploring the technical breakthroughs and challenges in creating music with AI.

What is Residual Vector Quantization?

Neural Audio Compression methods based on Residual Vector Quantization are reshaping the landscape of modern audio codecs. In this guide, learn the basic ideas behind RVQ and how it enhances Neural Compression.

How RLHF Preference Model Tuning Works (And How Things May Go Wrong)

Large Language Models like ChatGPT are trained with Reinforcement Learning From Human Feedback (RLHF) to learn human preferences. Let’s uncover how RLHF works and survey its most significant current limitations.

How Reinforcement Learning from AI Feedback works

Reinforcement Learning from AI Feedback (RLAIF) is a supervision technique that uses a "constitution" to make AI assistants like ChatGPT safer. Learn everything you need to know about RLAIF in this guide.

Recent developments in Generative AI for Audio

The spotlight has been on language and images for Generative AI, but there's been a lot of recent progress in the audio domain. Learn everything you need to know about generative audio models in this article.

Introduction to Large Language Models for Generative AI

Generative AI language models like ChatGPT are changing the way humans and AI interact and work together, but how do these models actually work? Learn everything you need to know about modern Generative AI for language in this simple guide.

The Full Story of Large Language Models and RLHF

Large Language Models have been in the limelight since the release of ChatGPT, with new models being announced seemingly every week. This guide walks through the essential ideas of how these models came to be.

Introduction to Generative AI

Generative AI has made tremendous strides recently, from models like Stable Diffusion to ChatGPT. Get up to speed on the latest advancements with this easy-to-follow introduction to Generative AI.

Conformer-1: A robust speech recognition model trained on 650K hours of data

We’ve trained a Conformer speech recognition model on 650K hours of audio data. The new model, Conformer-1, approaches human-level performance for speech recognition, reaching a new state-of-the-art on real-world audio data.

Transcribe audio or video files right from your terminal

We built the AssemblyAI CLI to help developers quickly test our latest models right from their terminal, with minimal installation required.

New for Enterprise: Improved Accuracy, Always-on Support, and SOC 2 Type 2

Today, we’re excited to announce two new Enterprise offerings — AutoTune Early Access and Premier Support — and share the latest on our continued commitment to security.

2022 Benchmark Report

In this benchmark report, we compare the transcription accuracy of our new v8 model architecture against Google Cloud Speech-to-Text and AWS Transcribe on a variety of audio use cases.

Coming Soon in Fall 2022 at AssemblyAI

Boosted by our recent $30M Series B announcement, product velocity at AssemblyAI has been accelerating faster than ever before. Now, we’re thrilled to announce a slew of model updates and new services on the horizon for fall 2022.

Our $30M Series B

Today, we’re excited to share that we’ve raised another $30M in our Series B round led by global software investor Insight Partners.

How Imagen Actually Works

Given a brief description of a scene, Imagen can generate photorealistic, high-resolution images of the scene. Learn everything you need to know about Imagen and how it works in this easy-to-follow guide.

Introduction to Diffusion Models for Machine Learning

The meteoric rise of Diffusion Models is one of the biggest developments in Machine Learning in the past several years. Learn everything you need to know about Diffusion Models in this easy-to-follow guide.