Text-informed audio source separation. Example-based approach using non-negative matrix partial co-factorization

Luc Le Magoarou 1, Alexey Ozerov 2, Ngoc Duong 2
1 PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio
IRISA-D5 - SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE, Inria Rennes – Bretagne Atlantique
Abstract: Informed audio source separation, in which the separation process is guided by auxiliary information, has recently attracted considerable research interest, because classical blind (non-informed) approaches often fail to perform satisfactorily in practical applications. In this paper we present a novel text-informed framework in which a target speech source is separated from the background of a mixture using the corresponding text. First, given the text, we produce a speech example, either with a speech synthesizer or from a human speaker. We then use this example to guide source separation; for that purpose, we introduce a new variant of the non-negative matrix partial co-factorization (NMPCF) model based on an excitation-filter-channel speech model. This model allows linguistic information to be shared between the speech example and the speech in the mixture. We derive the corresponding multiplicative update (MU) rules for parameter estimation, and we propose and investigate several extensions of the model. Extensive experiments assess the effectiveness of the proposed approach in terms of source separation and alignment performance.
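As a rough illustration of the partial co-factorization idea in the abstract, the Python sketch below factorizes a mixture spectrogram and an example spectrogram with a shared speech dictionary, plus extra background components for the mixture only. It is a deliberately simplified stand-in, not the paper's excitation-filter-channel model: it assumes a plain Euclidean cost, a single shared dictionary `W_s`, and a weight `lam` on the example term. All function and parameter names (`nmpcf`, `separate`, `n_speech`, `n_bg`, `lam`) are hypothetical.

```python
import numpy as np

EPS = 1e-12  # small constant to avoid division by zero


def nmpcf(V_mix, V_ex, n_speech=8, n_bg=4, lam=1.0, n_iter=200, seed=0):
    """Simplified partial co-factorization (assumed Euclidean cost):
         V_mix ~ W_s @ H_x + W_b @ H_b   (speech + background)
         V_ex  ~ W_s @ H_y               (speech example only)
    W_s is shared, which couples the example to the mixture."""
    rng = np.random.default_rng(seed)
    F, N = V_mix.shape
    M = V_ex.shape[1]
    W_s = rng.random((F, n_speech)) + EPS
    W_b = rng.random((F, n_bg)) + EPS
    H_x = rng.random((n_speech, N)) + EPS
    H_b = rng.random((n_bg, N)) + EPS
    H_y = rng.random((n_speech, M)) + EPS

    def cost():
        r_mix = V_mix - (W_s @ H_x + W_b @ H_b)
        r_ex = V_ex - W_s @ H_y
        return float(np.sum(r_mix ** 2) + lam * np.sum(r_ex ** 2))

    costs = [cost()]
    for _ in range(n_iter):
        # Shared dictionary: both data terms contribute to the update.
        V_hat = W_s @ H_x + W_b @ H_b
        num = V_mix @ H_x.T + lam * (V_ex @ H_y.T)
        den = V_hat @ H_x.T + lam * ((W_s @ H_y) @ H_y.T) + EPS
        W_s *= num / den

        # Background dictionary: only the mixture term contributes.
        V_hat = W_s @ H_x + W_b @ H_b
        W_b *= (V_mix @ H_b.T) / (V_hat @ H_b.T + EPS)

        # Activations, each updated against a refreshed model.
        V_hat = W_s @ H_x + W_b @ H_b
        H_x *= (W_s.T @ V_mix) / (W_s.T @ V_hat + EPS)
        V_hat = W_s @ H_x + W_b @ H_b
        H_b *= (W_b.T @ V_mix) / (W_b.T @ V_hat + EPS)
        H_y *= (W_s.T @ V_ex) / (W_s.T @ (W_s @ H_y) + EPS)

        costs.append(cost())

    return {"W_s": W_s, "W_b": W_b, "H_x": H_x, "H_b": H_b,
            "H_y": H_y, "costs": costs}


def separate(V_mix, model):
    """Split the mixture with soft (Wiener-like) masks from the model."""
    speech = model["W_s"] @ model["H_x"]
    background = model["W_b"] @ model["H_b"]
    total = speech + background + EPS
    return V_mix * speech / total, V_mix * background / total
```

In this sketch `V_mix` and `V_ex` would be non-negative magnitude or power spectrograms with the same number of frequency bins; only `W_s` is tied across the two factorizations, so the example constrains the spectral shape of the speech while leaving its timing in the mixture (`H_x`) free.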
Document type:
Journal article
Journal of Signal Processing Systems, Springer, 2014, pp.13
Cited literature: 29 references

https://hal.inria.fr/hal-01010602
Contributor: Luc Le Magoarou
Submitted on: Friday, 20 June 2014 - 09:58:33
Last modified on: Wednesday, 16 May 2018 - 11:24:07
Document(s) archived on: Saturday, 20 September 2014 - 10:45:37

File

template.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01010602, version 1

Citation

Luc Le Magoarou, Alexey Ozerov, Ngoc Duong. Text-informed audio source separation. Example-based approach using non-negative matrix partial co-factorization. Journal of Signal Processing Systems, Springer, 2014, pp.13. 〈hal-01010602〉

Metrics

Record views: 404

File downloads: 441