Newsletter #37: Speaker Diarization Now in 5 New Languages 🇨🇳🇮🇳🇯🇵🇰🇷🇻🇳 & Latest Speech AI Tutorials
AssemblyAI's Speaker Diarization model now supports five additional languages: Chinese 🇨🇳, Hindi 🇮🇳, Japanese 🇯🇵, Korean 🇰🇷, and Vietnamese 🇻🇳.



Hey 👋, this weekly update contains the latest info on our new product features, tutorials, and our community.
New Language Support for Speaker Diarization
AssemblyAI's Speaker Diarization model now supports five additional languages: Chinese 🇨🇳, Hindi 🇮🇳, Japanese 🇯🇵, Korean 🇰🇷, and Vietnamese 🇻🇳. This feature is available in both our Best and Nano tiers.
The Speaker Diarization model detects multiple speakers in an audio file and identifies what each speaker said. To start building with this feature, set speaker_labels to true in your transcription configuration. For more examples, check out our documentation.
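In the Python SDK, that configuration looks like the sketch below. The API key and file path are placeholders, and the utterance-formatting helper is our own illustration, not part of the SDK:

```python
# Sketch: requesting speaker diarization with the AssemblyAI Python SDK.
# The helper below is a hypothetical formatter for the returned utterances.

def format_utterances(utterances):
    """Render (speaker, text) pairs as 'Speaker A: ...' lines."""
    return [f"Speaker {speaker}: {text}" for speaker, text in utterances]

if __name__ == "__main__":
    import assemblyai as aai  # pip install assemblyai

    aai.settings.api_key = "YOUR_API_KEY"  # placeholder

    # speaker_labels=True enables the Speaker Diarization model.
    config = aai.TranscriptionConfig(speaker_labels=True)
    transcript = aai.Transcriber().transcribe("audio.mp3", config=config)

    # Each utterance carries a speaker label and the text that speaker said.
    pairs = [(u.speaker, u.text) for u in transcript.utterances]
    print("\n".join(format_utterances(pairs)))
```

Each item in `transcript.utterances` pairs a speaker label (A, B, ...) with the words attributed to that speaker, so you can reconstruct a turn-by-turn conversation.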
Fresh From Our Blog
Automatically determine video sections with AI using Python: Learn how to automatically determine video sections, how to generate section titles with LLMs, and how to format the information for YouTube chapters. Read more>>
Filter profanity from audio files using Python: Learn how to filter profanity out of audio and video files with fewer than 10 lines of code in this tutorial. Read more>>
How to use audio data in LlamaIndex with Python: Discover how to incorporate audio files into LlamaIndex and build an LLM-powered query engine in this step-by-step tutorial. Read more>>
Our Trending YouTube Tutorials
Coding an AI Voice Bot from Scratch: Real-Time Conversation with Python: Learn how to build a real-time AI voice assistant using Python that can handle incoming calls, transcribe speech, generate intelligent responses, and provide a human-like conversational experience. Perfect for call centers, customer support, and virtual receptionist applications.
How to use @postman to test LLMs with audio data (Transcribe and Understand): Learn how to transcribe audio and video files with AssemblyAI, and how to use LeMUR, AssemblyAI's framework for applying Large Language Models to spoken data, without writing any code.
Build A Talking AI with LLAMA 3 (Python tutorial): This tutorial shows you how to build a talking AI using real-time transcription with AssemblyAI, using LLAMA 3 as the language model with Ollama, and ElevenLabs for text-to-speech.