Conference paper, Year: 2019

Multichannel Speech Enhancement Based on Time-frequency Masking Using Subband Long Short-Term Memory

Abstract

We propose a multichannel speech enhancement method using a long short-term memory (LSTM) recurrent neural network. The method operates in the short-time Fourier transform (STFT) domain. A single LSTM network, shared by all frequency bands, is trained to process each frequency band individually: it maps the multichannel noisy STFT coefficient sequence of that band to the corresponding STFT magnitude ratio mask sequence of one reference channel. This subband LSTM network exploits the differences between the temporal and spatial characteristics of speech and noise, namely that a speech source is non-stationary and coherent, whereas noise is stationary and less spatially correlated. Experiments with different types of noise show that the proposed method outperforms both a baseline deep-learning-based full-band method and an unsupervised method. In addition, since it does not learn the wideband spectral structure of either speech or noise, the proposed subband LSTM network generalizes very well to unseen speakers and noise types.
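The subband formulation described in the abstract can be illustrated with a small data-preparation sketch. It is not taken from the paper: the array layout `stft[c][f][t]`, the function names `subband_sequences` and `magnitude_ratio_mask`, the stacking of real and imaginary parts as input features, and the clipping of the mask to [0, 1] are all assumptions made for illustration. The shared LSTM itself is omitted; the point is only that each frequency bin yields its own input sequence and its own target mask sequence.

```python
from typing import List, Sequence

# Assumed layout (hypothetical): stft[c][f][t] is the complex STFT
# coefficient of channel c at frequency bin f and time frame t.
Stft = Sequence[Sequence[Sequence[complex]]]


def subband_sequences(stft: Stft) -> List[List[List[float]]]:
    """Return one feature sequence per frequency bin.

    Each frame's feature vector stacks the real and imaginary parts of
    every channel, so it has length 2 * n_channels. A network shared by
    all frequency bins would consume these sequences one bin at a time.
    """
    n_channels = len(stft)
    n_freqs = len(stft[0])
    n_frames = len(stft[0][0])
    sequences = []
    for f in range(n_freqs):
        seq = []
        for t in range(n_frames):
            frame = []
            for c in range(n_channels):
                x = stft[c][f][t]
                frame.extend([x.real, x.imag])
            seq.append(frame)
        sequences.append(seq)
    return sequences


def magnitude_ratio_mask(clean_ref, noisy_ref, eps=1e-8):
    """Target mask per bin and frame for the reference channel.

    Computes |clean| / |noisy|, clipped to at most 1 (the clipping and
    the eps guard are assumptions of this sketch).
    Inputs are indexed as [f][t].
    """
    return [
        [min(1.0, abs(s) / (abs(x) + eps)) for s, x in zip(s_row, x_row)]
        for s_row, x_row in zip(clean_ref, noisy_ref)
    ]


# Toy example: 2 channels, 2 frequency bins, 3 frames.
noisy = [
    [[1 + 1j, 2 + 0j, 0 + 1j], [3 + 0j, 0 + 2j, 1 - 1j]],  # channel 0
    [[0 + 1j, 1 + 1j, 2 + 2j], [1 + 0j, 0 - 1j, 2 + 0j]],  # channel 1
]
per_bin_inputs = subband_sequences(noisy)
```

With this layout, `per_bin_inputs[f]` is the sequence a shared network would see for frequency bin `f`, and `magnitude_ratio_mask(clean[0], noisy[0])[f]` would be the matching target sequence if channel 0 were the reference and a clean signal `clean` were available.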
Main file
Xiaofei_WASPAA2019.pdf (230.9 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02264247, version 1 (06-08-2019)
hal-02264247, version 2 (14-10-2019)

Identifiers

  • HAL Id: hal-02264247, version 1

Cite

Xiaofei Li, Radu Horaud. Multichannel Speech Enhancement Based on Time-frequency Masking Using Subband Long Short-Term Memory. WASPAA 2019 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct 2019, New Paltz, NY, United States. ⟨hal-02264247v1⟩
