Conference papers

Text-informed audio source separation using nonnegative matrix partial co-factorization

Abstract : We consider a single-channel source separation problem that consists of separating speech from a nonstationary background such as music. We introduce a novel approach called text-informed separation, in which the source separation process is guided by the corresponding textual information. First, given the text, we propose to produce a speech example using either a speech synthesizer or a human speaker. We then use this example to guide the source separation and, for that purpose, we introduce a new variant of the nonnegative matrix partial co-factorization (NMPCF) model based on a so-called excitation-filter-channel speech model. The proposed NMPCF model allows the linguistic information to be shared between the example speech and the speech in the mixture. We then derive the corresponding multiplicative update (MU) rules for parameter estimation. Experimental results over different types of mixtures and speech examples show the effectiveness of the proposed approach.
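
To illustrate the general idea of partial co-factorization with multiplicative updates, here is a minimal sketch in Python with NumPy. It is not the model from the paper: the excitation-filter-channel structure is omitted, a plain Euclidean cost is assumed, and the shared component is simply a spectral dictionary `W_s` used by both the example spectrogram `X_ex` and the mixture spectrogram `X_mix`. The function name `nmpcf`, its parameters, and the weight `lam` are all hypothetical choices made for this illustration.

```python
import numpy as np


def nmpcf(X_mix, X_ex, k_shared=4, k_noise=4, lam=1.0, n_iter=100, seed=0):
    """Simplified NMPCF sketch (assumed Euclidean cost, not the paper's model).

    X_mix ~= W_s @ H_s + W_n @ H_n   (mixture: speech part + background part)
    X_ex  ~= W_s @ H_ex              (speech example shares the dictionary W_s)
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12  # avoids division by zero in the updates
    F, N = X_mix.shape
    M = X_ex.shape[1]

    # Random nonnegative initialisation of all factors.
    W_s = rng.random((F, k_shared)) + eps
    W_n = rng.random((F, k_noise)) + eps
    H_s = rng.random((k_shared, N)) + eps
    H_n = rng.random((k_noise, N)) + eps
    H_ex = rng.random((k_shared, M)) + eps

    def cost():
        r_mix = X_mix - (W_s @ H_s + W_n @ H_n)
        r_ex = X_ex - W_s @ H_ex
        return 0.5 * np.sum(r_mix ** 2) + 0.5 * lam * np.sum(r_ex ** 2)

    costs = [cost()]
    for _ in range(n_iter):
        # Shared dictionary: gathers evidence from both mixture and example.
        V_mix = W_s @ H_s + W_n @ H_n
        V_ex = W_s @ H_ex
        W_s *= (X_mix @ H_s.T + lam * X_ex @ H_ex.T) / (
            V_mix @ H_s.T + lam * V_ex @ H_ex.T + eps)

        # Background dictionary: only the mixture contributes.
        V_mix = W_s @ H_s + W_n @ H_n
        W_n *= (X_mix @ H_n.T) / (V_mix @ H_n.T + eps)

        # Activations of the speech and background parts in the mixture.
        V_mix = W_s @ H_s + W_n @ H_n
        H_s *= (W_s.T @ X_mix) / (W_s.T @ V_mix + eps)
        V_mix = W_s @ H_s + W_n @ H_n
        H_n *= (W_n.T @ X_mix) / (W_n.T @ V_mix + eps)

        # Activations of the example.
        H_ex *= (W_s.T @ X_ex) / (W_s.T @ (W_s @ H_ex) + eps)

        costs.append(cost())

    # Soft (Wiener-like) mask applied to the mixture spectrogram.
    V_speech = W_s @ H_s
    V_mix = V_speech + W_n @ H_n
    speech_estimate = X_mix * V_speech / (V_mix + eps)

    return {"W_shared": W_s, "W_noise": W_n, "H_speech": H_s,
            "H_noise": H_n, "H_example": H_ex,
            "speech_estimate": speech_estimate, "costs": costs}
```

In this sketch the inputs are assumed to be nonnegative magnitude or power spectrograms with the same number of frequency bins. The weight `lam` controls how strongly the example constrains the shared dictionary; the paper's own model and update rules differ and should be taken from the PDF listed below.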

https://hal.inria.fr/hal-00870066
Contributor : Alexey Ozerov
Submitted on : Friday, October 4, 2013 - 6:00:38 PM
Last modification on : Tuesday, October 31, 2017 - 8:52:02 AM
Long-term archiving on : Sunday, January 5, 2014 - 8:30:18 AM

File

MLSP2013_FINAL_VERSION.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00870066, version 1

Citation

Luc Le Magoarou, Alexey Ozerov, Ngoc Duong. Text-informed audio source separation using nonnegative matrix partial co-factorization. IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2013), Sep 2013, Southampton, United Kingdom. ⟨hal-00870066⟩
