Reports (Technical report)

Dynamic Bayesian networks for symbolic polyphonic pitch modeling

Stanislaw Raczynski 1 Emmanuel Vincent 1 Shigeki Sagayama 2 
1 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : Symbolic pitch modeling incorporates knowledge about the relations between pitches into the analysis of musical information or signals, typically within a statistical framework. It has proven to be an efficient way of improving the performance of various Music Information Retrieval (MIR) algorithms. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models for multiple pitch estimation, which account for both the "horizontal" and the "vertical" pitch structure. These models are formulated as linear or log-linear interpolations of up to five submodels, each of which models a different aspect of music. The ability of the models to predict symbolic pitch data is evaluated in terms of their cross-entropy and of a newly proposed "contextual cross-entropy" measure. Their performance is then measured on synthesized polyphonic audio signals in terms of multiple pitch estimation accuracy, in combination with a Nonnegative Matrix Factorization-based acoustic model. In both experiments, log-linear combinations of at least one "vertical" (e.g. harmony) and one "horizontal" (e.g. note duration) model outperformed the baseline methods, by almost 60% in cross-entropy reduction and almost 4% in multiple pitch estimation accuracy. This work provides a proof of concept of the usefulness of model interpolation for pitch modeling, which may be used for improved symbolic modeling in the future.
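The two interpolation schemes and the cross-entropy measure named in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the function names, the toy two-symbol distributions, and the fixed weights are all assumptions made for illustration, and every probability is assumed to be strictly positive so that the logarithms are defined. Linear interpolation takes a weighted arithmetic mean of the submodel distributions, while log-linear interpolation takes a weighted geometric mean and renormalizes it.

```python
import math


def linear_interpolation(dists, weights):
    """Weighted arithmetic mean of the submodel distributions.

    Assumes the weights are non-negative and sum to 1.
    """
    n_symbols = len(dists[0])
    return [sum(w * d[k] for w, d in zip(weights, dists))
            for k in range(n_symbols)]


def log_linear_interpolation(dists, weights):
    """Weighted geometric mean of the submodel distributions, renormalized.

    Assumes every probability is strictly positive.
    """
    n_symbols = len(dists[0])
    log_scores = [sum(w * math.log(d[k]) for w, d in zip(weights, dists))
                  for k in range(n_symbols)]
    peak = max(log_scores)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(s - peak) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def cross_entropy_bits(predicted, observed):
    """Average negative log2-probability given to the observed symbols.

    predicted[t] is the distribution predicted at step t and
    observed[t] is the index of the symbol that actually occurred.
    """
    return -sum(math.log2(p[o]) for p, o in zip(predicted, observed)) / len(observed)


# Two hypothetical submodels over a toy two-symbol alphabet.
submodel_a = [0.5, 0.5]
submodel_b = [0.9, 0.1]
weights = [0.5, 0.5]

mixed_linear = linear_interpolation([submodel_a, submodel_b], weights)
mixed_log_linear = log_linear_interpolation([submodel_a, submodel_b], weights)
```

A lower cross-entropy means the model assigns higher probability to the data it is asked to predict, which is how the abstract's figure for cross-entropy reduction should be read. In practice the interpolation weights would be estimated from data rather than fixed by hand as they are here.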
Contributor : Stanislaw Raczynski
Submitted on : Monday, March 25, 2013 - 5:37:02 PM
Last modification on : Wednesday, October 26, 2022 - 8:14:00 AM
Long-term archiving on : Sunday, April 2, 2017 - 8:02:16 PM


  • HAL Id : hal-00728771, version 3


Stanislaw Raczynski, Emmanuel Vincent, Shigeki Sagayama. Dynamic Bayesian networks for symbolic polyphonic pitch modeling. [Technical Report] RT-0430, 2012. ⟨hal-00728771v3⟩


