Review - Text-Free Prosody-Aware Generative Spoken Language Modeling
In this week's Deep Learning Paper Review, we look at the following paper: Text-Free Prosody-Aware Generative Spoken Language Modeling.
Deep dives into AI, research, coding, and other topics.
Deep Learning
In this week's Deep Learning Paper Review, we look at the following paper: Text-Free Prosody-Aware Generative Spoken Language Modeling.
Deep Learning
AssemblyAI Speech-to-Text API's Speaker Diarization (diarisation) is the process of splitting audio or video inputs automatically based on the speaker's identity. It helps you answer the question "who spoke when?".
Deep Learning
Since the Attention Is All You Need paper, Transformers have completely redefined the field of Natural Language Processing. In this blog, we show you how to quickly fine-tune Transformers for numerous downstream tasks, that often perform really well out of the box!
Deep Learning
As part of our core research and development efforts to continue pushing the state of the art of speech recognition accuracy, in this post, we explore speech recognition architectures that are gaining new popularity in both academia and industry settings.
Deep Learning
The complete guide on how to build an end-to-end Speech Recognition model in PyTorch. Train your own CTC Deep Speech model using this tutorial.