Conference papers

Using full-rank spatial covariance models for noise-robust ASR

Dung Tran (1), Emmanuel Vincent (1), Denis Jouvet (1), Kamil Adiloglu (2)
(1) PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : We present a joint spatial and spectral denoising front-end for Track 1 of the 2nd CHiME Speech Separation and Recognition Challenge, based on the Flexible Audio Source Separation Toolbox (FASST). We represent the sources by nonnegative matrix factorization (NMF) and full-rank spatial covariances, which are known to be appropriate for modeling small source movements. We then learn acoustic models for automatic speech recognition (ASR) on the enhanced training data. We obtain a 40% average error rate reduction due to speech separation compared to multicondition training alone.
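
The source model named in the abstract can be illustrated with a minimal sketch. It is not the FASST implementation and not necessarily the exact configuration used in the paper. It assumes the simplest form of the model: each source j has a short-term power spectrum v_j(f, n) given by an NMF product W_j H_j and a time-invariant full-rank spatial covariance R_j(f), and each source image is estimated from the mixture by a multichannel Wiener filter. The function names, argument layout, and the regularization constant `eps` are hypothetical, and parameter estimation (typically done by EM) is not shown.

```python
import numpy as np


def nmf_variances(W, H):
    """Short-term power spectrum v[f, n] of one source from NMF factors.

    W: (F, K) nonnegative spectral basis, H: (K, N) nonnegative activations.
    """
    return W @ H


def multichannel_wiener(X, W_list, H_list, R_list, eps=1e-10):
    """Estimate source spatial images from a multichannel mixture STFT.

    X         : (F, N, I) complex mixture STFT (F bins, N frames, I channels)
    W_list[j] : (F, K_j) NMF basis of source j (hypothetical layout)
    H_list[j] : (K_j, N) NMF activations of source j
    R_list[j] : (F, I, I) full-rank Hermitian spatial covariance of source j
    Returns an array of shape (J, F, N, I) with one image per source.
    """
    I = X.shape[-1]
    # Per-source covariance: Sigma_j[f, n] = v_j[f, n] * R_j[f]
    Sigma = np.stack([
        nmf_variances(W, H)[:, :, None, None] * R[:, None, :, :]
        for W, H, R in zip(W_list, H_list, R_list)
    ])  # (J, F, N, I, I)
    # Mixture covariance, lightly regularized so the inverse is well defined
    Sigma_x = Sigma.sum(axis=0) + eps * np.eye(I)  # (F, N, I, I)
    # Wiener gain G_j = Sigma_j @ inv(Sigma_x), broadcast over sources
    G = Sigma @ np.linalg.inv(Sigma_x)
    # Apply each gain matrix to the mixture vector in every time-frequency bin
    return np.einsum('jfnab,fnb->jfna', G, X)
```

Because the Wiener gains of all sources sum to (almost exactly) the identity, the estimated source images add back up to the mixture. The full-rank R_j(f) is what lets the filter account for spatial spread rather than a single fixed direction per source.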

Contributor : Emmanuel Vincent
Submitted on : Friday, March 15, 2013 - 10:59:07 AM
Last modification on : Monday, May 4, 2020 - 11:39:00 AM
Long-term archiving on : Monday, June 17, 2013 - 2:22:18 PM




  • HAL Id : hal-00801162, version 1


Dung Tran, Emmanuel Vincent, Denis Jouvet, Kamil Adiloglu. Using full-rank spatial covariance models for noise-robust ASR. CHiME - 2nd International Workshop on Machine Listening in Multisource Environments - 2013, Jun 2013, Vancouver, Canada. pp.31-32. ⟨hal-00801162⟩


