Text-informed audio source separation using nonnegative matrix partial co-factorization

Luc Le Magoarou; Alexey Ozerov; Ngoc Q. K. Duong

Conference paper, Year: 2013

Luc Le Magoarou

Role: Author
PersonId: 4463
IdHAL: luc-le-magoarou
ORCID: 0000-0002-2871-4406
IdRef: 197010563

Technicolor R & I [Cesson Sévigné]

Alexey Ozerov

Role: Author
PersonId: 930358

Technicolor R & I [Cesson Sévigné]

Ngoc Q. K. Duong

Role: Author
PersonId: 946470

Technicolor R & I [Cesson Sévigné]

Abstract

We consider a single-channel source separation problem that consists of separating speech from a nonstationary background such as music. We introduce a novel approach called text-informed separation, where the source separation process is guided by the corresponding textual information. First, given the text, we propose to produce a speech example using either a speech synthesizer or a human speaker. We then use this example to guide source separation and, for that purpose, we introduce a new variant of the nonnegative matrix partial co-factorization (NMPCF) model based on a so-called excitation-filter-channel speech model. The proposed NMPCF model allows linguistic information to be shared between the example speech and the speech in the mixture. We then derive the corresponding multiplicative update (MU) rules for parameter estimation. Experimental results over different types of mixtures and speech examples show the effectiveness of the proposed approach.
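
The idea of partial co-factorization with multiplicative updates can be illustrated with a minimal Python sketch. It is a deliberately simplified stand-in and not the model of the paper: it shares a single speech dictionary `W_s` between the mixture and the example instead of using the excitation-filter-channel structure, and it assumes a Euclidean cost for simplicity. The function name `nmpcf_sketch`, the weight `lam`, the component counts, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def nmpcf_sketch(V_mix, V_ex, k_speech=6, k_bg=6, lam=1.0, n_iter=200,
                 seed=0, eps=1e-12):
    """Simplified partial co-factorization (an assumption, not the paper's model).

    V_mix ~ W_s @ H_s + W_b @ H_b   (mixture: speech + background)
    V_ex  ~ W_s @ H_e               (speech example, shares W_s)
    Cost: ||V_mix - V_mix_hat||^2 + lam * ||V_ex - V_ex_hat||^2 (Euclidean).
    """
    rng = np.random.default_rng(seed)
    F, N = V_mix.shape
    M = V_ex.shape[1]
    # Random nonnegative initialisation of all factors.
    W_s = rng.random((F, k_speech)) + eps
    W_b = rng.random((F, k_bg)) + eps
    H_s = rng.random((k_speech, N)) + eps
    H_b = rng.random((k_bg, N)) + eps
    H_e = rng.random((k_speech, M)) + eps
    costs = []
    for _ in range(n_iter):
        # Dictionary updates (both use the same current reconstructions).
        Vm_hat = W_s @ H_s + W_b @ H_b
        Ve_hat = W_s @ H_e
        W_s *= (V_mix @ H_s.T + lam * (V_ex @ H_e.T)) / (
            Vm_hat @ H_s.T + lam * (Ve_hat @ H_e.T) + eps)
        W_b *= (V_mix @ H_b.T) / (Vm_hat @ H_b.T + eps)
        # Activation updates with refreshed reconstructions.
        Vm_hat = W_s @ H_s + W_b @ H_b
        Ve_hat = W_s @ H_e
        H_s *= (W_s.T @ V_mix) / (W_s.T @ Vm_hat + eps)
        H_b *= (W_b.T @ V_mix) / (W_b.T @ Vm_hat + eps)
        H_e *= (W_s.T @ V_ex) / (W_s.T @ Ve_hat + eps)
        # Track the joint cost after the full sweep.
        Vm_hat = W_s @ H_s + W_b @ H_b
        Ve_hat = W_s @ H_e
        costs.append(float(np.sum((V_mix - Vm_hat) ** 2)
                           + lam * np.sum((V_ex - Ve_hat) ** 2)))
    # Soft mask: share of each mixture bin attributed to the speech part.
    S_hat = V_mix * (W_s @ H_s) / (Vm_hat + eps)
    return S_hat, costs


# Synthetic nonnegative "spectrograms" (random data, for illustration only).
rng = np.random.default_rng(1)
W_true_s, W_true_b = rng.random((40, 6)), rng.random((40, 6))
V_mix = W_true_s @ rng.random((6, 50)) + W_true_b @ rng.random((6, 50))
V_ex = W_true_s @ rng.random((6, 30))
S_hat, costs = nmpcf_sketch(V_mix, V_ex)
```

The example matrix `V_ex` only influences the shared factor `W_s`, which is what makes the factorization "partial": the background factors `W_b` and `H_b` are fitted to the mixture alone. In practice the inputs would be magnitude or power spectrograms of the mixture and of the synthesized or spoken example.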

Keywords

Informed audio source separation; text information; nonnegative matrix partial co-factorization; source-filter model

Domains

Signal and Image Processing [eess.SP]

Main file

MLSP2013_FINAL_VERSION.pdf (564.03 KB)

Origin: Files produced by the author(s)

Contributor: Alexey Ozerov

https://inria.hal.science/hal-00870066

Submitted on: Friday, October 4, 2013, 18:00:38

Last modified on: Tuesday, October 31, 2017, 08:52:02

Long-term archiving on: Sunday, January 5, 2014, 08:30:18

Dates and versions

hal-00870066, version 1 (04-10-2013)

Identifiers

HAL Id: hal-00870066, version 1

Cite

Luc Le Magoarou, Alexey Ozerov, Ngoc Q. K. Duong. Text-informed audio source separation using nonnegative matrix partial co-factorization. IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2013), Sep 2013, Southampton, United Kingdom. ⟨hal-00870066⟩
