Conference papers

Text-informed audio source separation using nonnegative matrix partial co-factorization

Abstract: We consider a single-channel source separation problem: separating speech from a nonstationary background such as music. We introduce a novel approach called text-informed separation, in which the source separation process is guided by the corresponding textual information. First, given the text, we propose to produce a speech example using either a speech synthesizer or a human speaker. We then use this example to guide source separation and, for that purpose, we introduce a new variant of the nonnegative matrix partial co-factorization (NMPCF) model based on a so-called excitation-filter-channel speech model. The proposed NMPCF model allows the linguistic information to be shared between the example speech and the speech in the mixture. We then derive the corresponding multiplicative update (MU) rules for parameter estimation. Experimental results over different types of mixtures and speech examples show the effectiveness of the proposed approach.
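
The general idea of partial co-factorization with multiplicative updates can be illustrated with a minimal sketch. This is not the paper's model: the excitation-filter-channel structure is omitted, a plain shared spectral dictionary stands in for the shared linguistic information, and a Euclidean cost is assumed for simplicity. The function name nmpcf_sketch and all of its parameters (n_speech, n_bg, lam, n_iter, seed) are hypothetical and chosen only for illustration.

```python
import numpy as np


def nmpcf_sketch(V_mix, V_ex, n_speech=8, n_bg=8, lam=1.0, n_iter=100, seed=0):
    """Simplified partial co-factorization (assumption: Euclidean cost).

    V_mix ~ W_s @ H_s + W_b @ H_b   (mixture = speech + background)
    V_ex  ~ W_s @ H_e               (example speech shares the dictionary W_s)
    Both inputs are nonnegative magnitude spectrograms with the same
    number of frequency bins.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12  # guards against division by zero
    F, N = V_mix.shape
    M = V_ex.shape[1]

    # Random nonnegative initialization of all factors.
    W_s = rng.random((F, n_speech)) + eps
    W_b = rng.random((F, n_bg)) + eps
    H_s = rng.random((n_speech, N)) + eps
    H_b = rng.random((n_bg, N)) + eps
    H_e = rng.random((n_speech, M)) + eps

    def cost():
        mix_err = np.sum((V_mix - W_s @ H_s - W_b @ H_b) ** 2)
        ex_err = np.sum((V_ex - W_s @ H_e) ** 2)
        return mix_err + lam * ex_err

    costs = [cost()]
    for _ in range(n_iter):
        # Shared dictionary: receives gradient terms from both factorizations.
        X_hat = W_s @ H_s + W_b @ H_b
        E_hat = W_s @ H_e
        W_s *= (V_mix @ H_s.T + lam * V_ex @ H_e.T) / (
            X_hat @ H_s.T + lam * E_hat @ H_e.T + eps
        )
        # Background dictionary: only the mixture constrains it.
        X_hat = W_s @ H_s + W_b @ H_b
        W_b *= (V_mix @ H_b.T) / (X_hat @ H_b.T + eps)
        # Activations, with the model approximation refreshed before each update.
        X_hat = W_s @ H_s + W_b @ H_b
        H_s *= (W_s.T @ V_mix) / (W_s.T @ X_hat + eps)
        X_hat = W_s @ H_s + W_b @ H_b
        H_b *= (W_b.T @ V_mix) / (W_b.T @ X_hat + eps)
        H_e *= (W_s.T @ V_ex) / (W_s.T @ (W_s @ H_e) + eps)
        costs.append(cost())

    # Soft mask: share of the mixture model explained by the speech part.
    X_hat = W_s @ H_s + W_b @ H_b
    speech_est = V_mix * (W_s @ H_s) / (X_hat + eps)
    return speech_est, costs


# Usage on synthetic nonnegative data (stand-ins for magnitude spectrograms).
demo_rng = np.random.default_rng(42)
V_mix_demo = demo_rng.random((16, 40))
V_ex_demo = demo_rng.random((16, 30))
speech_demo, cost_history = nmpcf_sketch(V_mix_demo, V_ex_demo, n_iter=30)
```

In this sketch the weight lam controls how strongly the example constrains the shared dictionary; the returned speech_est is a masked version of the mixture spectrogram, so it would still need the mixture phase to be turned back into a waveform.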

Contributor: Alexey Ozerov
Submitted on: Friday, October 4, 2013 - 6:00:38 PM
Last modification on: Tuesday, October 31, 2017 - 8:52:02 AM
Long-term archiving on: Sunday, January 5, 2014 - 8:30:18 AM


  • HAL Id: hal-00870066, version 1


Luc Le Magoarou, Alexey Ozerov, Ngoc Q. K. Duong. Text-informed audio source separation using nonnegative matrix partial co-factorization. IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2013), Sep 2013, Southampton, United Kingdom. ⟨hal-00870066⟩


