Harmonic and inharmonic nonnegative matrix factorization for polyphonic pitch transcription

Emmanuel Vincent 1 Nancy Bertin 2 Roland Badeau 2
1 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : Polyphonic pitch transcription consists of estimating the onset time, duration and pitch of each note in a music signal. This task is difficult in general, due to the wide range of possible instruments. This issue has been studied using adaptive models such as Nonnegative Matrix Factorization (NMF), which describe the signal as a weighted sum of basis spectra. However, basis spectra that represent multiple pitches result in inaccurate transcription. To avoid this, we propose a family of constrained NMF models, where each basis spectrum is expressed as a weighted sum of narrowband spectra, each consisting of a few adjacent partials at harmonic or inharmonic frequencies. The model parameters are adapted via combined multiplicative and Newton updates. The proposed method is shown to outperform standard NMF on a database of piano excerpts.
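
The constrained model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names, the Gaussian shape of each partial, the number of partials per narrowband spectrum, the stiff-string formula f_h = h * f0 * sqrt(1 + B * h^2) used for inharmonic frequencies, and the fixed unit weights. For simplicity, only the activations are updated, using the standard multiplicative rule for the Euclidean cost; the combined multiplicative and Newton updates mentioned in the abstract are not reproduced.

```python
import numpy as np

def partial_frequencies(f0, n_partials, inharmonicity=0.0):
    """Frequencies of the first partials of a note with fundamental f0.

    inharmonicity=0 gives exact harmonics h * f0; a positive value applies
    the stiff-string formula (an assumption of this sketch).
    """
    h = np.arange(1, n_partials + 1)
    return h * f0 * np.sqrt(1.0 + inharmonicity * h ** 2)

def narrowband_spectra(freqs, f0, n_partials=12, per_band=3, width=15.0,
                       inharmonicity=0.0):
    """Return an (n_bins, n_bands) matrix of narrowband spectra.

    Each column sums a few adjacent partials, each modelled here as a
    Gaussian peak of standard deviation `width` Hz (illustrative choice).
    """
    pf = partial_frequencies(f0, n_partials, inharmonicity)
    peaks = np.exp(-0.5 * ((freqs[:, None] - pf[None, :]) / width) ** 2)
    bands = [peaks[:, s:s + per_band].sum(axis=1)
             for s in range(0, n_partials, per_band)]
    return np.stack(bands, axis=1)

def basis_spectrum(bands, weights):
    """Basis spectrum for one pitch: weighted sum of its narrowband spectra."""
    return bands @ weights

def fit_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate nonnegative activations H such that V is close to W @ H.

    W is kept fixed; H follows the standard multiplicative update for the
    Euclidean cost, which keeps every entry nonnegative.
    """
    H = np.ones((W.shape[1], V.shape[1]))
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)
    return H

# Usage: three pitches, one basis spectrum each, unit weights on every band.
freqs = np.linspace(0.0, 4000.0, 512)
pitches = (220.0, 277.18, 329.63)
W = np.stack([basis_spectrum(narrowband_spectra(freqs, f0), np.ones(4))
              for f0 in pitches], axis=1)
rng = np.random.default_rng(0)
V = W @ rng.random((3, 20))      # synthetic magnitude spectrogram
H = fit_activations(V, W)        # one activation row per pitch
```

Because every basis spectrum is built from partials of a single fundamental, each row of H can be read as the activity of one pitch, which is the property the constraint is meant to guarantee.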

https://hal.inria.fr/inria-00544183
Contributor : Emmanuel Vincent
Submitted on : Tuesday, December 7, 2010 - 2:35:11 PM
Last modification on : Thursday, March 21, 2019 - 2:20:42 PM
Document(s) archived on : Tuesday, March 8, 2011 - 4:25:12 AM

File

vincent_ICASSP08.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : inria-00544183, version 1

Citation

Emmanuel Vincent, Nancy Bertin, Roland Badeau. Harmonic and inharmonic nonnegative matrix factorization for polyphonic pitch transcription. 2008 IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Mar 2008, Las Vegas, United States. pp. 109-112. ⟨inria-00544183⟩
