A fast EM algorithm for Gaussian model-based source separation

Joachim Thiemann (1, 2), Emmanuel Vincent (3)
1 PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio
IRISA-D5 - SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE, Inria Rennes – Bretagne Atlantique
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
3 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : We consider the FASST framework for audio source separation, which models the sources by full-rank spatial covariance matrices and multilevel nonnegative matrix factorization (NMF) spectra. The computational cost of the expectation-maximization (EM) algorithm in [1] increases greatly with the number of channels. We present alternative EM updates based on discrete hidden variables, which have a lower computational cost. We evaluate the results on mixtures of speech and real-world environmental noise taken from our DEMAND database. The proposed algorithm is several orders of magnitude faster and provides better separation quality for two-channel mixtures in low input signal-to-noise ratio (iSNR) conditions.
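To make the model named in the abstract concrete, here is a minimal sketch of separation under a full-rank Gaussian model with NMF spectra, assuming the model parameters are already known. It does not reproduce the EM updates of [1] or the proposed discrete-hidden-variable updates, which this record does not detail. The function name `wiener_separate`, the array layouts, and the single-level factorization `W @ H` (instead of the multilevel NMF of FASST) are illustrative assumptions.

```python
import numpy as np

def wiener_separate(x, W, H, R):
    """Multichannel Wiener filtering under a full-rank Gaussian model (sketch).

    Assumed (hypothetical) array layouts:
      x: (F, N, I)    complex mixture STFT, I channels
      W: (J, F, K)    nonnegative NMF spectral bases, one set per source
      H: (J, K, N)    nonnegative NMF activations, one set per source
      R: (J, F, I, I) full-rank Hermitian spatial covariance per source and bin
    Returns the (J, F, N, I) estimated spatial image of every source.
    """
    # Source variances v_j(f, n) = sum_k W_j(f, k) H_j(k, n)
    v = np.einsum("jfk,jkn->jfn", W, H)
    # Source image covariances v_j(f, n) * R_j(f), shape (J, F, N, I, I)
    sigma_c = v[..., None, None] * R[:, :, None, :, :]
    # Mixture covariance is the sum over sources, shape (F, N, I, I)
    sigma_x = sigma_c.sum(axis=0)
    # Solve sigma_x y = x for every time-frequency bin: one I x I system per bin
    y = np.linalg.solve(sigma_x, x[..., None])
    # Wiener estimate of each source image: sigma_c_j @ sigma_x^{-1} @ x
    return (sigma_c @ y)[..., 0]
```

Each time-frequency bin requires solving an I x I linear system, which illustrates why the cost of inference in this model depends on the number of channels I. By construction, the source image estimates sum back to the mixture in every bin.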

https://hal.inria.fr/hal-00840366
Contributor : Emmanuel Vincent
Submitted on : Tuesday, July 2, 2013 - 12:38:04 PM
Last modification on : Wednesday, April 3, 2019 - 1:23:12 AM
Long-term archiving on : Wednesday, April 5, 2017 - 6:03:21 AM

File

thiemann_EUSIPCO13.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00840366, version 1

Citation

Joachim Thiemann, Emmanuel Vincent. A fast EM algorithm for Gaussian model-based source separation. EUSIPCO - 21st European Signal Processing Conference - 2013, Sep 2013, Marrakech, Morocco. ⟨hal-00840366⟩
