Report (Technical Report) Year: 2012

Dynamic Bayesian networks for symbolic polyphonic pitch modeling

Abstract

Symbolic pitch modeling incorporates knowledge about the relations between pitches into the analysis of musical information or signals, typically within a statistical framework. It has proven effective at improving the performance of various Music Information Retrieval (MIR) algorithms. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models for multiple pitch estimation, which account for both the "horizontal" and the "vertical" pitch structure. These models are formulated as linear or log-linear interpolations of up to five submodels, each of which is responsible for modeling a different aspect of music. The ability of the models to predict symbolic pitch data is evaluated in terms of their cross-entropy and of a newly proposed "contextual cross-entropy" measure. Their performance is then measured on synthesized polyphonic audio signals in terms of multiple pitch estimation accuracy, in combination with a Nonnegative Matrix Factorization-based acoustic model. In both experiments, the log-linear combinations of at least one "vertical" (e.g. harmony) and one "horizontal" (e.g. note duration) model outperformed the baseline methods, by almost 60% in cross-entropy reduction and almost 4% in multiple pitch estimation accuracy. This work provides a proof of concept of the usefulness of model interpolation in the area of pitch modeling, which may be used for improved symbolic modeling in the future.
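The two interpolation schemes and the cross-entropy measure named in the abstract can be illustrated with a minimal Python sketch. It assumes each submodel outputs a Bernoulli probability that a given pitch is active in a given frame; the function names, the weights, and the toy numbers are hypothetical and are not taken from the report, whose actual submodels and weight-training procedure are not described in this abstract.

```python
import math


def linear_interpolation(probs, weights):
    """Weighted sum of submodel probabilities (weights assumed to sum to 1)."""
    return sum(w * p for w, p in zip(weights, probs))


def log_linear_interpolation(probs, weights):
    """Weighted product of submodel probabilities, renormalized over the
    two outcomes (pitch active / pitch inactive)."""
    on = math.prod(p ** w for p, w in zip(probs, weights))
    off = math.prod((1.0 - p) ** w for p, w in zip(probs, weights))
    return on / (on + off)


def cross_entropy(predicted, observed):
    """Average negative log2-likelihood, in bits per binary pitch variable."""
    total = 0.0
    for p, x in zip(predicted, observed):
        total -= math.log2(p if x == 1 else 1.0 - p)
    return total / len(observed)


# Toy example: two hypothetical submodels scoring the same pitch.
submodel_probs = [0.8, 0.6]
weights = [0.5, 0.5]
p_linear = linear_interpolation(submodel_probs, weights)
p_log_linear = log_linear_interpolation(submodel_probs, weights)
```

A lower cross-entropy on held-out symbolic data means the combined model assigns higher probability to the pitches that actually occur, which is the sense in which the abstract reports a reduction relative to the baseline.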
Main file
RT-430_updated.pdf (7.95 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00728771, version 1 (06-09-2012)
hal-00728771, version 2 (26-10-2012)
hal-00728771, version 3 (25-03-2013)

Identifiers

  • HAL Id: hal-00728771, version 3

Cite

Stanislaw Raczynski, Emmanuel Vincent, Shigeki Sagayama. Dynamic Bayesian networks for symbolic polyphonic pitch modeling. [Technical Report] RT-0430, 2012. ⟨hal-00728771v3⟩
