Notes on nonnegative tensor factorization of the spectrogram for audio source separation: statistical insights and towards self-clustering of the spatial cues

Cédric Févotte 1, Alexey Ozerov 2
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: Nonnegative tensor factorization (NTF) of multichannel spectrograms under PARAFAC structure has recently been proposed by Fitzgerald et al. as a means of performing blind source separation (BSS) of multichannel audio data. In this paper we investigate the statistical source models implied by this approach. We show that it implicitly assumes a nonpoint-source model, in contrast with usual BSS assumptions, and we clarify the links between the measure of fit chosen for the NTF and the implied statistical distribution of the sources. While the original approach of Fitzgerald et al. requires an a posteriori clustering of the spatial cues to group the NTF components into sources, we discuss means of performing the clustering within the factorization. In the results section we test the impact of the simplifying nonpoint-source assumption on underdetermined linear instantaneous mixtures of musical sources and discuss the limits of the approach for such mixtures.
Document type:
Conference paper
7th International Symposium on Computer Music Modeling and Retrieval (CMMR 2010), Oct 2010, Málaga, Spain. pp.www.cmmr2010.etsit.uma.es, 2010

https://hal.inria.fr/inria-00553355
Contributor: Alexey Ozerov
Submitted on: Friday, January 7, 2011 - 11:37:53
Last modified on: Thursday, January 11, 2018 - 06:23:38
Document(s) archived on: Friday, April 8, 2011 - 02:26:14

File

FevotteOzerov_CMMR10.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00553355, version 1

Citation

Cédric Févotte, Alexey Ozerov. Notes on nonnegative tensor factorization of the spectrogram for audio source separation: statistical insights and towards self-clustering of the spatial cues. 7th International Symposium on Computer Music Modeling and Retrieval (CMMR 2010), Oct 2010, Málaga, Spain. pp.www.cmmr2010.etsit.uma.es, 2010. 〈inria-00553355〉
