Popular - News, Tutorials, AI Research

Case Studies

Nov 19, 2024

How we built our AI Lakehouse

Learn how we built our AI data Lakehouse to allow for rapid research iteration while maintaining cohesive, secure, and deduplicated datasets.

Ahmed Etefy, Ryan O'Connor

Tech Lead - Data Infrastructure, Senior Developer Educator

Deep Learning

Aug 21, 2024

Decoding Strategies: How LLMs Choose The Next Word

Large Language Models are trained to guess the next word. But when generating text, the combination of their probability estimates with algorithms known as decoding strategies is what determines how they actually choose words. Learn how decoding strategies work in this article.

Marco Ramponi

Developer Educator

Deep Learning

Feb 20, 2024

AI trends in 2024: Graph Neural Networks

From fundamental research to productionized AI models, let’s discover how this cutting-edge technology is powering production applications and may be shaping the future of AI.

Marco Ramponi

Developer Educator

Deep Learning

Dec 7, 2023

AI for Universal Audio Understanding: Qwen-Audio Explained

Researchers have recently made progress towards universal audio understanding, a step towards foundational audio models. The approach is based on joint audio-language pre-training that improves performance without task-specific fine-tuning.

Marco Ramponi

Developer Educator

Deep Learning

Oct 27, 2023

Combining Speech Recognition and Diarization in one model

A new approach towards multi-speaker speech processing integrates Speaker Diarization and Automatic Speech Recognition in a unified framework. We discuss the key insights from this recent exciting development in Speech AI research.

Marco Ramponi

Developer Educator

Deep Learning

Sep 29, 2023

How DALL-E 2 Actually Works

How does OpenAI's groundbreaking DALL-E 2 model actually work? Check out this detailed guide to learn the ins and outs of DALL-E 2.

Ryan O'Connor

Senior Developer Educator

Deep Learning

Sep 22, 2023

What AI Music Generators Can Do (And How They Do It)

Text-to-Music Models are advancing rapidly with the recent release of new platforms for AI-generated music. This guide focuses on MusicLM, MusicGen, and Stable Audio, exploring the technical breakthroughs and challenges in creating music with AI.

Marco Ramponi

Developer Educator

Deep Learning

Sep 4, 2023

What is Residual Vector Quantization?

Neural Audio Compression methods based on Residual Vector Quantization are reshaping the landscape of modern audio codecs. In this guide, learn the basic ideas behind RVQ and how it enhances Neural Compression.

Marco Ramponi

Developer Educator

Deep Learning

Aug 3, 2023

How RLHF Preference Model Tuning Works (And How Things May Go Wrong)

Large Language Models like ChatGPT are trained with Reinforcement Learning From Human Feedback (RLHF) to learn human preferences. Let’s uncover how RLHF works and survey its most significant current limitations.

Marco Ramponi

Developer Educator

Deep Learning

Aug 1, 2023

How Reinforcement Learning from AI Feedback works

Reinforcement Learning from AI Feedback (RLAIF) is a supervision technique that uses a "constitution" to make AI assistants like ChatGPT safer. Learn everything you need to know about RLAIF in this guide.

Ryan O'Connor

Senior Developer Educator

Deep Learning

Jun 27, 2023

Recent developments in Generative AI for Audio

The spotlight has been on language and images for Generative AI, but there's been a lot of recent progress in the audio domain. Learn everything you need to know about generative audio models in this article.

Marco Ramponi

Developer Educator

Popular

May 17, 2023

Introduction to Large Language Models for Generative AI

Generative AI language models like ChatGPT are changing the way humans and AI interact and work together, but how do these models actually work? Learn everything you need to know about modern Generative AI for language in this simple guide.

Ryan O'Connor

Senior Developer Educator

Deep Learning

May 3, 2023

The Full Story of Large Language Models and RLHF

Large Language Models have been in the limelight since the release of ChatGPT, with new models being announced seemingly every week. This guide walks through the essential ideas of how these models came to be.

Marco Ramponi

Developer Educator

Deep Learning

May 2, 2023

Introduction to Generative AI

Generative AI has made tremendous strides recently, from models like Stable Diffusion to ChatGPT. Get up to speed on the latest advancements with this easy-to-follow introduction to Generative AI.

Ryan O'Connor

Senior Developer Educator

Announcements

Mar 15, 2023

Conformer-1: A robust speech recognition model trained on 650K hours of data

We’ve trained a Conformer speech recognition model on 650K hours of audio data. The new model, Conformer-1, approaches human-level performance for speech recognition, reaching a new state-of-the-art on real-world audio data.

Marco Ramponi

Developer Educator

Announcements

Oct 19, 2022

Transcribe audio or video files right from your terminal

We built the AssemblyAI CLI to help developers quickly test our latest models right from their terminal, with minimal installation required.

Francisco Castillo

Software Engineer at AssemblyAI

Announcements

Sep 6, 2022

New for Enterprise: Improved Accuracy, Always-on Support, and SOC 2 Type 2

Today, we’re excited to announce two new Enterprise offerings — AutoTune Early Access and Premier Support — and share the latest on our continued commitment to security.

Micky Teng

Head of Marketing

Announcements

Sep 2, 2022

2022 Benchmark Report

In this benchmark report, we compare the transcription accuracy of our new v8 model architecture against Google Cloud Speech-to-Text and AWS Transcribe on a variety of audio use cases.

Lee Vaughn

API Support Engineer

Announcements

Sep 1, 2022

Coming Soon in Fall 2022 at AssemblyAI

Boosted by our recent $30M Series B announcement, product velocity at AssemblyAI is higher than ever. Now, we’re thrilled to announce a slew of model updates and new services on the horizon for fall 2022.

Kelsey Foster

Growth

Announcements

Jul 14, 2022

Our $30M Series B

Today, we’re excited to share that we’ve raised another $30M in our Series B round led by global software investor Insight Partners.

Dylan Fox

Founder, CEO

Deep Learning

Jun 23, 2022

How Imagen Actually Works

Given a brief description of a scene, Imagen can generate photorealistic, high-resolution images of the scene. Learn everything you need to know about Imagen and how it works in this easy-to-follow guide.

Ryan O'Connor

Senior Developer Educator

Introduction to Diffusion Models for Machine Learning

Deep Learning

May 12, 2022

Introduction to Diffusion Models for Machine Learning

The meteoric rise of Diffusion Models is one of the biggest developments in Machine Learning in the past several years. Learn everything you need to know about Diffusion Models in this easy-to-follow guide.

Ryan O'Connor

Senior Developer Educator