Probabilistic model for main melody extraction using constant-Q transform

Dimension reduction techniques such as Nonnegative Tensor Factorization (NTF) are now classical tools for both source separation and multiple fundamental frequency estimation in audio mixtures. Still, few studies have addressed these tasks jointly so far, mainly because separation is often based on the Short-Time Fourier Transform (STFT), whereas recent music analysis algorithms tend to rely on the Constant-Q Transform (CQT). The CQT is practical for pitch estimation because a pitch shift amounts to a translation of the CQT representation along its log-frequency axis, whereas it produces a scaling of the STFT. Conversely, no simple inversion of the CQT was available until recently, which prevented its use for source separation. Benefiting from advances both in the inversion of the CQT and in statistical modeling, we show how recent techniques designed for music analysis can also be used for source separation with encouraging results, thus opening the path to many crossovers between separation and analysis.
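
The translation-versus-scaling property mentioned in the abstract can be checked numerically. The sketch below is not taken from the paper; it only compares where the partials of an idealized harmonic tone fall on a linear (STFT-like) axis and on a log-frequency (CQT-like) axis. The function names, the reference frequency `fmin = 55.0` Hz, the resolution of 12 bins per octave, and the 3-semitone shift are all assumptions chosen for illustration.

```python
import numpy as np

def harmonic_frequencies(f0, n_harmonics=5):
    """Frequencies (Hz) of the first partials of an ideal harmonic tone."""
    return f0 * np.arange(1, n_harmonics + 1)

def log_frequency_bin(freq, fmin=55.0, bins_per_octave=12):
    """Fractional bin index on a log-frequency (CQT-like) axis.

    fmin and bins_per_octave are illustrative values, not the paper's settings.
    """
    return bins_per_octave * np.log2(freq / fmin)

f0 = 220.0
f0_shifted = f0 * 2 ** (3 / 12)  # same note, 3 semitones higher

lin_a = harmonic_frequencies(f0)
lin_b = harmonic_frequencies(f0_shifted)
log_a = log_frequency_bin(lin_a)
log_b = log_frequency_bin(lin_b)

# Log-frequency axis: every partial moves by the same number of bins,
# so the whole pattern is translated.
log_offsets = log_b - log_a

# Linear axis: each partial moves by a different amount in Hz,
# but by the same ratio, so the pattern is scaled.
lin_offsets = lin_b - lin_a
lin_ratios = lin_b / lin_a

print("log-axis offsets (bins):", log_offsets)
print("linear-axis offsets (Hz):", lin_offsets)
print("linear-axis ratios:", lin_ratios)
```

Because the harmonic pattern keeps its shape on the log-frequency axis, a single spectral template shifted along that axis can stand in for every pitch, which is the property that shift-invariant models built on the CQT exploit.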

Keywords

Audio source separation, NTF, PLCA, CQT

Domains

Signal and image processing [eess.SP]

Main file

fuentes_ICASSP-2012.pdf (542.83 KB)

Origin: Files produced by the author(s)

Contributor: Roland Badeau

https://inria.hal.science/hal-00945290

Submitted on: Tuesday, 25 March 2014, 09:05:10

Last modified on: Monday, 9 October 2023, 12:49:40

Long-term archiving on: Wednesday, 25 June 2014, 10:43:39

Dates and versions

hal-00945290 , version 1 (25-03-2014)

Identifiers

HAL Id: hal-00945290, version 1

Cite

Benoît Fuentes, Antoine Liutkus, Roland Badeau, Gael Richard. Probabilistic model for main melody extraction using constant-Q transform. 37th International Conference on Acoustics, Speech, and Signal Processing ICASSP'12, 2012, Kyoto, Japan. pp. 5357-5360. ⟨hal-00945290⟩

Collections

INSTITUT-TELECOM, CNRS, PARISTECH, LTCI, IDS, S2A
