Announcing our $50M Series C to build superhuman Speech AI models

I’m excited to share that we’ve raised $50M in Series C funding led by Accel, our partners who also led our Series A, with participation from Keith Block and Smith Point Capital, Insight Partners, Daniel Gross and Nat Friedman, and Y Combinator. This brings AssemblyAI’s total funding raised to $115M, 90% of which we’ve raised in the last 22 months as organizations across virtually every industry have raced to embed Speech AI capabilities into their products, systems, and workflows.

We founded AssemblyAI with the vision of creating superhuman Speech AI models that would unlock an entirely new class of AI applications built on voice data. There is a tremendous amount of information embedded within human speech. Think of all the knowledge that exists within a company’s virtual meetings, for example, or in podcast and video data on the internet, in phone calls into small businesses and large contact centers, or in the ability to interact with machines using your voice. Being able to accurately understand, interpret, and build on top of voice data opens up a wealth of new opportunities for organizations across every industry.

Over the past two years, we’ve seen the combination of bigger datasets, better compute, and new neural network architectures like the Transformer drive significant advances in AI models across nearly every modality, making our vision of building superhuman Speech AI models more achievable than ever before.

Take our latest Conformer-2 model, for example. This model was trained on 1.1M hours of voice data and, upon release, achieved industry-leading accuracy and robustness on tasks like speech-to-text and speaker identification. Conformer-2 makes up to 43% fewer errors on noisy data than other models and delivers a nearly 50% accuracy improvement over our previous generation of models. This greater level of accuracy and capability has helped customers like Fireflies.ai offer far more useful and reliable AI meeting notes to their millions of users.

Over the past six months, we’ve also been hard at work on our next-gen Universal model, which will set a new state of the art on several multilingual Speech AI tasks. This new model is being trained on more than 10M hours of voice data (1 petabyte) leveraging Google’s new TPU chips, a 1,250x increase in training data compared to the first model AssemblyAI made available back in 2019. Our team is excited to release this next-gen model to our users in the very near future!

There now also exist incredibly capable LLMs that can ingest accurately recognized speech and generate summaries, insights, takeaways, and classifications, enabling entirely new products and workflows to be built with voice data for the first time. This LLM technology underpins our popular Audio Intelligence models like Auto Chapters and Content Moderation, which power brand safety and content moderation workloads at scale for leading enterprise companies, as well as our latest product, LeMUR, which can perform text generation tasks over recognized speech.
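To make that workflow concrete, here’s a minimal sketch of transcribing an audio file and then running a LeMUR text generation task over the transcript, assuming the assemblyai Python SDK; the API key, audio URL, and prompt are illustrative placeholders:

```python
# Minimal sketch: recognize speech, then run an LLM task over it with LeMUR.
# Assumes the assemblyai Python SDK; the key, URL, and prompt are placeholders.
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"  # placeholder

# Step 1: transcribe the audio (speech-to-text).
transcript = aai.Transcriber().transcribe("https://example.com/meeting.mp3")

# Step 2: ask LeMUR to generate text grounded in the recognized speech.
result = transcript.lemur.task(
    "Summarize the key takeaways from this meeting in three bullet points."
)
print(result.response)
```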

These new capabilities have enabled thousands of fast-growing organizations to build powerful Speech AI features into their products and workflows on top of our models. We now regularly serve over 25M inference calls and process over 10TB of voice data every day through our API for customers that include industry-leading startups like Fireflies.ai, Veed, TypeForm, Close, Loop Media, and CallRail. And with 10,000+ new organizations signing up for our API every month, we’re just scratching the surface of the new voice-powered AI applications we’ll see enter the market over the next year.

While we’re very proud of the progress we’ve made over the last two years, it’s very much still Day 1 for us, and we know there’s a lot of work ahead. This new capital will support our ambitious research plans, new model development, training compute, and market expansion, as well as help us build our team. We believe that the best way for us to continue to innovate is to bring together some of the best minds in AI, and we’re proud to have had an impressive roster of research leaders and scientists from DeepMind, Microsoft, Google, Amazon, and Meta join us over the past year.

I’m so grateful for the opportunity to work with all the great software developers and customers building with our API, and I couldn’t be more thankful for the support, feedback, and trust they have given us over the years.

There is a lot more to come — stay tuned!