Apply Noise Reduction to Audio for Streaming Speech-to-Text
This guide demonstrates how to implement a noise reduction system for real-time audio transcription using AssemblyAI’s Streaming STT and the noisereduce library. You’ll learn how to create a custom audio stream that preprocesses incoming audio to remove background noise before it reaches the transcription service.
This solution is particularly valuable for:
- Voice assistants operating in noisy environments
- Customer service applications processing calls
- Meeting transcription tools
- Voice-enabled applications requiring high accuracy
The implementation uses Python and combines proven audio processing techniques with AssemblyAI’s powerful transcription capabilities. While our example focuses on microphone input, the principles can be applied to any real-time audio stream.
Step-by-step guide
First, install the following packages: assemblyai, noisereduce, and numpy.
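Once the packages are installed with your package manager of choice, a quick check like the one below can confirm that they are importable before you continue. This is a minimal sketch: the helper name `missing_packages` is made up for illustration, and depending on your SDK version the microphone helper may need additional optional dependencies that this check does not cover.

```python
import importlib.util

# Package names taken from the install step above
REQUIRED = ("assemblyai", "noisereduce", "numpy")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```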
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for a free account and get your API key from your dashboard. Please note that Streaming Speech-to-Text is available for upgraded accounts only. If you’re on the free plan, you’ll need to upgrade your account by adding a credit card.
Do not share your API key with anyone. It is a private key uniquely associated with your account.
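One way to keep the key out of your source code is to read it from an environment variable. The sketch below assumes a variable named `ASSEMBLYAI_API_KEY`; the helper `load_api_key` is hypothetical and not part of the SDK.

```python
import os


def load_api_key(env_var="ASSEMBLYAI_API_KEY"):
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your AssemblyAI API key"
        )
    return key


# Usage sketch (requires the assemblyai package):
#   import assemblyai as aai
#   aai.settings.api_key = load_api_key()
```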
Create functions to handle different events during transcription.
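The handlers might look like the following sketch. It assumes the SDK passes objects with a `session_id` attribute (on open) and a `text` attribute (on data), and that final transcripts can be told apart from partial ones by their type. The factory `make_on_data` and its `sink` parameter are illustrative additions so that the handler can be tried without a live session.

```python
def on_open(session_opened):
    # Assumes the session object exposes a `session_id` attribute
    print("Session ID:", session_opened.session_id)


def make_on_data(final_transcript_type, sink=print):
    """Build an on_data callback.

    Partial transcripts end with a carriage return so the next result
    overwrites them; final transcripts end with a newline so they stay.
    """
    def on_data(transcript):
        if not transcript.text:
            return  # ignore empty results
        if isinstance(transcript, final_transcript_type):
            sink(transcript.text, end="\r\n")
        else:
            sink(transcript.text, end="\r")
    return on_data


def on_error(error):
    print("An error occurred:", error)


def on_close():
    print("Closing session")


# Usage sketch (requires the assemblyai package):
#   import assemblyai as aai
#   on_data = make_on_data(aai.RealtimeFinalTranscript)
```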
Create a custom stream class that includes noise reduction.
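A minimal sketch of such a class follows. It is not necessarily the guide's original implementation: it assumes 16-bit mono PCM chunks, buffers roughly half a second of audio before each noise reduction pass, and takes both the audio source and the denoising function as parameters. The window length, the `prop_decrease` value, and the function name `denoise_pcm16` are illustrative choices.

```python
from typing import Callable, Iterable


class NoiseReducedMicrophoneStream:
    """Wrap a stream of 16-bit mono PCM chunks and denoise it in fixed windows."""

    def __init__(self, source: Iterable[bytes], sample_rate: int,
                 denoise: Callable[[bytes, int], bytes],
                 window_seconds: float = 0.5):
        self._source = iter(source)
        self._sample_rate = sample_rate
        self._denoise = denoise
        # Two bytes per int16 sample
        self._window_bytes = int(sample_rate * window_seconds) * 2
        self._buffer = bytearray()

    def __iter__(self):
        return self

    def __next__(self) -> bytes:
        # Pull chunks until a full window is buffered or the source ends
        while len(self._buffer) < self._window_bytes:
            try:
                self._buffer.extend(next(self._source))
            except StopIteration:
                break
        if not self._buffer:
            raise StopIteration
        window = bytes(self._buffer[:self._window_bytes])
        del self._buffer[:self._window_bytes]
        return self._denoise(window, self._sample_rate)


def denoise_pcm16(pcm: bytes, sample_rate: int) -> bytes:
    """Apply noisereduce to a window of 16-bit PCM audio."""
    # Imported lazily so the buffering class has no third-party dependencies
    import numpy as np
    import noisereduce as nr

    samples = np.frombuffer(pcm, dtype=np.int16)
    if len(samples) < 1024:
        return pcm  # too short to analyze reliably; pass through unchanged
    audio = samples.astype(np.float32) / 32768.0
    cleaned = nr.reduce_noise(y=audio, sr=sample_rate, prop_decrease=0.75)
    return (np.clip(cleaned, -1.0, 1.0) * 32767.0).astype(np.int16).tobytes()


# Usage sketch (requires assemblyai with microphone support):
#   import assemblyai as aai
#   mic = aai.extras.MicrophoneStream(sample_rate=16_000)
#   stream = NoiseReducedMicrophoneStream(mic, 16_000, denoise_pcm16)
```

Buffering trades latency for quality: a longer window gives the noise estimator more context but delays each chunk sent to the transcription service by about the window length.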
Now we create our transcriber and NoiseReducedMicrophoneStream.
You can press Ctrl+C to stop the transcription.
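The last two steps could be wired together roughly as follows. This sketch assumes a transcriber object with `connect()`, `stream()`, and `close()` methods, as the SDK's `RealtimeTranscriber` provides; newer SDK versions may expose a different streaming client. The helper `run_session` is hypothetical, and the handler and stream names in the usage comment stand for the objects built in the earlier steps.

```python
def run_session(transcriber, audio_stream):
    """Connect, stream audio until it ends or Ctrl+C is pressed, then close."""
    transcriber.connect()
    try:
        transcriber.stream(audio_stream)
    except KeyboardInterrupt:
        # Ctrl+C lands here, so the session is still closed cleanly below
        print("Stopping transcription...")
    finally:
        transcriber.close()


# Usage sketch (requires assemblyai with microphone support):
#   import assemblyai as aai
#   transcriber = aai.RealtimeTranscriber(
#       sample_rate=16_000,
#       on_data=on_data,
#       on_error=on_error,
#       on_open=on_open,
#       on_close=on_close,
#   )
#   run_session(transcriber, noise_reduced_stream)
```

Closing the transcriber in a `finally` block means the session is released whether the stream ends normally, fails, or is interrupted from the keyboard.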