Spatial location priors for Gaussian model based reverberant audio source separation

Ngoc Q. K. Duong 1 Emmanuel Vincent 2, 3 Rémi Gribonval 4
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
3 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
4 PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio
Inria Rennes – Bretagne Atlantique , IRISA-D5 - SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE
Abstract: We consider the Gaussian framework for reverberant audio source separation, where the sources are modeled in the time-frequency domain by their short-term power spectra and their spatial covariance matrices. We propose three alternative probabilistic priors over the spatial covariance matrices which are consistent with the theory of statistical room acoustics and we derive Expectation-Maximization (EM) algorithms for maximum a posteriori (MAP) estimation. We argue that these algorithms provide a statistically principled solution to the permutation problem and to the risk of overfitting resulting from conventional maximum likelihood (ML) estimation. We show experimentally that, in a semi-informed scenario where the source positions and certain room characteristics are known, the algorithms using inverse-Wishart and Gaussian priors, respectively, outperform their ML counterparts. This opens the way to rigorous statistical treatment of this family of models in other scenarios in the future.
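As a rough illustration of the Gaussian framework the abstract refers to, the sketch below shows how, once the short-term power spectra and spatial covariance matrices are known, the source spatial images at one time-frequency bin could be recovered by multichannel Wiener filtering. It is not code from the report. It assumes the usual form of this model, in which each source image is zero-mean complex Gaussian with covariance v_j R_j and the sources are independent, and it does not cover the priors or the EM/MAP parameter estimation that the report proposes. The names `wiener_separate`, `x`, `v` and `R` are hypothetical.

```python
import numpy as np


def wiener_separate(x, v, R):
    """Estimate source spatial images at a single time-frequency bin.

    x: (I,) complex mixture coefficient over I channels.
    v: (J,) nonnegative short-term power of each of the J sources.
    R: (J, I, I) Hermitian positive-definite spatial covariance matrices.
    Returns a (J, I) array with one estimated spatial image per source.
    """
    # Covariance of each source image under the assumed model: v_j * R_j.
    sigma_c = v[:, None, None] * R
    # Independent sources: the mixture covariance is the sum over sources.
    sigma_x = sigma_c.sum(axis=0)
    # Wiener estimate c_j = sigma_c_j @ inv(sigma_x) @ x, computed via a solve.
    y = np.linalg.solve(sigma_x, x)
    return sigma_c @ y


# Toy usage with random (assumed) parameters: 2 channels, 3 sources.
rng = np.random.default_rng(0)
I, J = 2, 3
A = rng.standard_normal((J, I, I)) + 1j * rng.standard_normal((J, I, I))
R = A @ A.conj().transpose(0, 2, 1) + 1e-3 * np.eye(I)
v = rng.uniform(0.5, 2.0, size=J)
x = rng.standard_normal(I) + 1j * rng.standard_normal(I)
c_hat = wiener_separate(x, v, R)
```

By construction the estimated images sum back to the mixture, which is a convenient sanity check on any implementation of this filter.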
Document type:
Report
[Research Report] RR-8057, INRIA. 2012
Cited literature [37 references]

https://hal.inria.fr/hal-00727781
Contributor: Emmanuel Vincent
Submitted on: Tuesday, 2 April 2013 - 23:26:25
Last modified on: Tuesday, 16 January 2018 - 15:54:22
Document(s) archived on: Wednesday, 3 July 2013 - 04:10:18

File

RR-8057.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00727781, version 2

Citation

Ngoc Q. K. Duong, Emmanuel Vincent, Rémi Gribonval. Spatial location priors for Gaussian model based reverberant audio source separation. [Research Report] RR-8057, INRIA. 2012. 〈hal-00727781v2〉

Metrics

Record views: 1527

File downloads: 266