Notes on nonnegative tensor factorization of the spectrogram for audio source separation : statistical insights and towards self-clustering of the spatial cues

Cédric Févotte; Alexey Ozerov

Conference paper, Year: 2010

Cédric Févotte

Role: Author
PersonId: 184864
IdHAL: cedric-fevotte
ORCID: 0000-0003-3801-5534
IdRef: 083298460

Laboratoire Traitement et Communication de l'Information

Alexey Ozerov

Role: Author
PersonId: 882775

Speech and sound data modeling and processing

Abstract

Nonnegative tensor factorization (NTF) of multichannel spectrograms under PARAFAC structure has recently been proposed by Fitzgerald et al. as a means of performing blind source separation (BSS) of multichannel audio data. In this paper we investigate the statistical source models implied by this approach. We show that it implicitly assumes a nonpoint-source model, in contrast with usual BSS assumptions, and we clarify the links between the measure of fit chosen for the NTF and the implied statistical distribution of the sources. While the original approach of Fitzgerald et al. requires a posterior clustering of the spatial cues to group the NTF components into sources, we discuss means of performing the clustering within the factorization. In the results section we test the impact of the simplifying nonpoint-source assumption on underdetermined linear instantaneous mixtures of musical sources and discuss the limits of the approach for such mixtures.
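To make the PARAFAC structure mentioned in the abstract concrete, the following is a minimal sketch of a nonnegative tensor factorization fitted with multiplicative updates. It is not the authors' implementation. It assumes a squared Euclidean measure of fit for simplicity (other measures of fit lead to different update rules), and the names `V`, `W`, `H`, `Q`, `K`, `reconstruct` and `ntf_parafac` are illustrative choices, not taken from the paper. Here `V[f, n, i]` stands for a nonnegative spectrogram value at frequency `f`, frame `n` and channel `i`, and each column of `Q` plays the role of the spatial cue of one component.

```python
import numpy as np


def reconstruct(W, H, Q):
    """PARAFAC model: V_hat[f, n, i] = sum_k W[f, k] * H[n, k] * Q[i, k]."""
    return np.einsum("fk,nk,ik->fni", W, H, Q)


def ntf_parafac(V, K, n_iter=200, eps=1e-12, seed=0):
    """Fit nonnegative factors W (F x K), H (N x K), Q (I x K) to V (F x N x I).

    Assumption of this sketch: squared Euclidean cost, minimized with
    Lee-Seung style multiplicative updates applied to each factor in turn.
    """
    rng = np.random.default_rng(seed)
    F, N, I = V.shape
    W = rng.random((F, K)) + eps
    H = rng.random((N, K)) + eps
    Q = rng.random((I, K)) + eps

    costs = []
    for _ in range(n_iter):
        # Record the cost before this round of updates.
        costs.append(float(np.sum((V - reconstruct(W, H, Q)) ** 2)))

        # Each update multiplies a factor by a ratio of nonnegative terms,
        # so the factors stay nonnegative throughout.
        V_hat = reconstruct(W, H, Q)
        W *= np.einsum("fni,nk,ik->fk", V, H, Q) / (
            np.einsum("fni,nk,ik->fk", V_hat, H, Q) + eps)

        V_hat = reconstruct(W, H, Q)
        H *= np.einsum("fni,fk,ik->nk", V, W, Q) / (
            np.einsum("fni,fk,ik->nk", V_hat, W, Q) + eps)

        V_hat = reconstruct(W, H, Q)
        Q *= np.einsum("fni,fk,nk->ik", V, W, H) / (
            np.einsum("fni,fk,nk->ik", V_hat, W, H) + eps)

    # Cost after the last round of updates.
    costs.append(float(np.sum((V - reconstruct(W, H, Q)) ** 2)))
    return W, H, Q, costs


# Usage on a synthetic tensor that follows the model exactly
# (sizes are arbitrary: 20 frequencies, 30 frames, 2 channels, 3 components).
rng = np.random.default_rng(1)
V = reconstruct(rng.random((20, 3)), rng.random((30, 3)), rng.random((2, 3)))
W, H, Q, costs = ntf_parafac(V, K=3)
```

In this sketch the `K` components are not yet grouped into sources. In the approach described in the abstract, that grouping is obtained by clustering the spatial cues (the columns of `Q` here) after the factorization, and the paper discusses ways of performing it within the factorization instead.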

Keywords

Nonnegative tensor factorization (NTF); audio source separation; nonpoint-source models; multiplicative parameter updates

Domains

Signal and Image Processing [eess.SP]

Main file

FevotteOzerov_CMMR10.pdf (180.19 KB)

Origin: Files produced by the author(s)

Contributor: Alexey Ozerov

https://inria.hal.science/inria-00553355

Submitted on: Friday, January 7, 2011, 11:37:53

Last modified on: Monday, October 9, 2023, 12:49:40

Long-term archiving on: Friday, April 8, 2011, 02:26:14

Dates and versions

inria-00553355, version 1 (07-01-2011)

Identifiers

HAL Id: inria-00553355, version 1

Cite

Cédric Févotte, Alexey Ozerov. Notes on nonnegative tensor factorization of the spectrogram for audio source separation : statistical insights and towards self-clustering of the spatial cues. 7th International Symposium on Computer Music Modeling and Retrieval (CMMR 2010), Oct 2010, Málaga, Spain. pp.www.cmmr2010.etsit.uma.es. ⟨inria-00553355⟩

Collections

INSTITUT-TELECOM EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA PARISTECH IRISA-D5 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE LTCI UR1-MATH-NUM
