Adaptive harmonic spectral decomposition for multiple pitch estimation

Emmanuel Vincent (1), Nancy Bertin (1), Roland Badeau (2)
(1) METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: Multiple pitch estimation consists of estimating the fundamental frequencies and saliences of pitched sounds over short time frames of an audio signal. This task forms the basis of several applications in the particular context of musical audio. One approach is to decompose the short-term magnitude spectrum of the signal into a sum of basis spectra representing individual pitches scaled by time-varying amplitudes, using algorithms such as nonnegative matrix factorization (NMF). Prior training of the basis spectra is often infeasible due to the wide range of possible musical instruments. Appropriate spectra must then be adaptively estimated from the data, which may limit performance because of overfitting. In this article, we model each basis spectrum as a weighted sum of narrowband spectra representing a few adjacent harmonic partials, thus enforcing harmonicity and spectral smoothness while adapting the spectral envelope to each instrument. We derive an NMF-like algorithm to estimate the model parameters and evaluate it on a database of piano recordings, considering several choices for the narrowband spectra. The proposed algorithm performs similarly to supervised NMF using pre-trained piano spectra but improves pitch estimation performance by 6% to 10% compared to alternative unsupervised NMF algorithms.
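The model described in the abstract can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes Gaussian lobes for the partials, non-overlapping groups of adjacent partials for the narrowband spectra, and multiplicative updates for the Kullback-Leibler divergence, none of which the abstract specifies. All function and parameter names (`narrowband_spectra`, `reconstruct`, `fit`, `width`, `n_bands`) are hypothetical.

```python
import numpy as np


def narrowband_spectra(f0, freqs, n_partials=12, n_bands=4, width=20.0):
    """Return an (F, K) matrix whose column k holds a few adjacent partials of f0.

    Assumption: each partial is a Gaussian lobe of standard deviation `width` Hz,
    and bands are non-overlapping groups of adjacent partials.
    """
    partials = f0 * np.arange(1, n_partials + 1)
    lobes = np.exp(-0.5 * ((freqs[:, None] - partials[None, :]) / width) ** 2)
    groups = np.array_split(np.arange(n_partials), n_bands)
    return np.stack([lobes[:, g].sum(axis=1) for g in groups], axis=1)


def reconstruct(N, E, H):
    """Model spectrogram: basis j is sum_k E[j, k] * N[j, :, k], scaled by H[j, :]."""
    W = np.einsum("jfk,jk->fj", N, E)
    return W @ H


def kl_divergence(X, V, eps=1e-12):
    """Generalized Kullback-Leibler divergence between X and its model V."""
    return float(np.sum(X * np.log((X + eps) / (V + eps)) - X + V))


def fit(X, N, n_iter=100, eps=1e-12, seed=0):
    """Estimate envelope weights E (J, K) and amplitudes H (J, T).

    X: (F, T) nonnegative magnitude spectrogram.
    N: (J, F, K) fixed narrowband spectra, one (F, K) block per candidate pitch.
    Uses KL multiplicative updates (an assumption made for this sketch).
    """
    rng = np.random.default_rng(seed)
    J, _, K = N.shape
    T = X.shape[1]
    E = rng.random((J, K)) + 0.1
    H = rng.random((J, T)) + 0.1
    for _ in range(n_iter):
        # Update the spectral envelope weights with H fixed.
        R = X / (reconstruct(N, E, H) + eps)
        num = np.einsum("jfk,ft,jt->jk", N, R, H)
        den = np.einsum("jfk,jt->jk", N, H) + eps
        E *= num / den
        # Update the time-varying amplitudes with E fixed.
        W = np.einsum("jfk,jk->fj", N, E)
        R = X / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)
    return E, H
```

Because the narrowband spectra in `N` are fixed and only their weights `E` are learned, every basis spectrum stays harmonic by construction, while the weights adapt the spectral envelope to the data. To use the sketch, stack one `narrowband_spectra` block per candidate pitch into `N` and read the estimated pitch saliences from the rows of `H`.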
Document type: Journal article
IEEE Trans. on Audio, Speech and Language Processing, IEEE, 2010, 18 (3), pp. 528-537
Cited literature: 39 references

https://hal.inria.fr/inria-00544094
Contributor: Emmanuel Vincent
Submitted on: Tuesday, 7 December 2010 - 11:33:46
Last modified on: Thursday, 11 January 2018 - 06:20:09
Document(s) archived on: Tuesday, 8 March 2011 - 03:58:18

File

vincent_TASLP10.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id: inria-00544094, version 1

Citation

Emmanuel Vincent, Nancy Bertin, Roland Badeau. Adaptive harmonic spectral decomposition for multiple pitch estimation. IEEE Trans. on Audio, Speech and Language Processing, IEEE, 2010, 18 (3), pp. 528-537. 〈inria-00544094〉
