This week’s AI Research Review is Multistream CNN For Robust Acoustic Modeling
Multistream CNN For Robust Acoustic Modeling
What’s Exciting About this Paper
Multistream CNN is built on the idea that by using different dilation rates across parallel streams, the layers learn “different” views of the features at multiple temporal resolutions.
Key Findings
The convolution matrix in TDNN-F is decomposed into two smaller factors, with an orthonormal constraint applied to the factorization, which appears to boost performance on this particular task.
Multistream CNN is essentially N parallel streams of convolutional layers, each with its own dilation rate, that process the same input side by side; the stream outputs are concatenated before the final layers.
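To make the parallel-stream idea concrete, here is a minimal pure-Python sketch, not the paper's implementation. It assumes a single scalar feature per frame, one fixed kernel shared by all streams, and zero padding at the edges; the function names `dilated_conv1d` and `multistream` are hypothetical. In the real model each stream is a stack of learned layers rather than one fixed filter.

```python
def dilated_conv1d(frames, kernel, dilation):
    """1-D convolution over time with zero padding at the edges.

    frames:   list of floats, one value per time step (simplifying assumption)
    kernel:   odd-length list of weights centered on the current frame
    dilation: spacing, in frames, between neighboring kernel taps
    """
    half = len(kernel) // 2
    out = []
    for t in range(len(frames)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - half) * dilation
            if 0 <= idx < len(frames):  # taps outside the signal count as zero
                acc += w * frames[idx]
        out.append(acc)
    return out


def multistream(frames, kernel, dilations):
    """Run one stream per dilation rate, then concatenate the outputs per frame."""
    streams = [dilated_conv1d(frames, kernel, d) for d in dilations]
    # Each time step ends up with a vector holding one entry per stream.
    return [list(step) for step in zip(*streams)]


frames = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = multistream(frames, kernel=[1.0, 0.0, -1.0], dilations=[1, 2, 3])
```

With this difference kernel, each stream compares frames that are `dilation` steps to either side of the current one, so a larger dilation rate looks at a wider temporal context of the same input.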
Our Takeaways
Multi-resolution optimization helps the model learn more robust features across the different “viewpoints.” This approach could be used with different modeling techniques.
TDNN-F layers improve upon standard conv1d layers on this task, apparently because of the factorized convolution matrix and its orthonormal constraint.
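The decomposition mentioned under Key Findings can be illustrated with a small pure-Python sketch. It is a toy example under stated assumptions, not the TDNN-F implementation: the matrix sizes, the values in `A` and `B`, the choice of which factor carries the constraint, and the helper names are all hypothetical. It shows a weight matrix built as the product of two smaller factors, one of which has orthonormal rows.

```python
def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def orthonormality_error(b):
    """Largest absolute entry of B B^T - I; zero when B's rows are orthonormal."""
    gram = matmul(b, transpose(b))
    n = len(gram)
    return max(
        abs(gram[i][j] - (1.0 if i == j else 0.0))
        for i in range(n)
        for j in range(n)
    )


# Hypothetical sizes: a 4x4 layer factored through a rank-2 bottleneck.
B = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.6, 0.8, 0.0]]  # 2x4 factor whose rows are orthonormal
A = [[1.0, 2.0],
     [0.0, 1.0],
     [3.0, 0.0],
     [1.0, 1.0]]            # 4x2 factor, unconstrained

W = matmul(A, B)            # effective 4x4 weight matrix, rank at most 2
error = orthonormality_error(B)
```

In training, a constraint like this would have to be enforced or re-imposed as the weights are updated; here `B` is simply chosen to satisfy it, and `error` only measures how far a given factor is from orthonormal.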