Dynamic Bayesian networks for symbolic polyphonic pitch modeling

Stanislaw Raczynski; Emmanuel Vincent; Shigeki Sagayama

Report (Technical Report) Year: 2012



Stanislaw Raczynski

Role: Author
PersonId: 929669

Speech and sound data modeling and processing

Emmanuel Vincent

Role: Author
PersonId: 1256
IdHAL: emmanuelv
ORCID: 0000-0002-0183-7289
IdRef: 089360176

Speech and sound data modeling and processing

Shigeki Sagayama

Role: Author

The University of Tokyo

Abstract

Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals, and it is typically done in a statistical framework. It has proven to be an effective way of improving the performance of various Music Information Retrieval (MIR) algorithms. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models for multiple pitch estimation, which account for both the "horizontal" and the "vertical" pitch structure. These models are formulated as linear or log-linear interpolations of up to five submodels, each of which is responsible for modeling a different aspect of music. The ability of the models to predict symbolic pitch data is evaluated in terms of their cross-entropy and of a newly proposed "contextual cross-entropy" measure. Their performance is then measured on synthesized polyphonic audio signals in terms of the accuracy of multiple pitch estimation in combination with a Nonnegative Matrix Factorization-based acoustic model. In both experiments, the log-linear combinations of at least one "vertical" (e.g. harmony) and one "horizontal" (e.g. note duration) model outperformed the baseline methods, by almost 60% in cross-entropy reduction and almost 4% in multiple pitch estimation accuracy. This work provides a proof of concept of the usefulness of model interpolation in the area of pitch modeling, which may be used for improved symbolic modeling in the future.
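
The abstract names two ways of combining submodels (linear and log-linear interpolation) and one evaluation measure (cross-entropy). The following is a minimal sketch of those generic techniques, not the report's actual formulation: the function names, the two toy submodels, the equal weights, and the choice of bits per event as the cross-entropy unit are all assumptions made for illustration, and the probabilities are made-up numbers.

```python
import math

def linear_interpolation(submodels, weights):
    """Weighted arithmetic mean of distributions (weights assumed to sum to 1)."""
    n = len(submodels[0])
    return [sum(w * p[i] for p, w in zip(submodels, weights)) for i in range(n)]

def log_linear_interpolation(submodels, weights):
    """Weighted geometric mean of distributions, renormalized to sum to 1."""
    n = len(submodels[0])
    unnormalized = [
        math.prod(p[i] ** w for p, w in zip(submodels, weights)) for i in range(n)
    ]
    z = sum(unnormalized)  # normalization constant
    return [u / z for u in unnormalized]

def cross_entropy(predictions, observed):
    """Average negative log2-probability assigned to the observed symbols."""
    total = sum(math.log2(p[x]) for p, x in zip(predictions, observed))
    return -total / len(observed)

# Two hypothetical submodels over three pitch symbols (toy values).
submodel_a = [0.7, 0.2, 0.1]
submodel_b = [0.5, 0.4, 0.1]
weights = [0.5, 0.5]

linear = linear_interpolation([submodel_a, submodel_b], weights)
log_linear = log_linear_interpolation([submodel_a, submodel_b], weights)
```

Both combinations return a valid probability distribution. The linear form averages the submodels, while the log-linear form multiplies them, so a symbol that any one submodel considers unlikely is penalized more strongly. A lower cross-entropy on held-out data means the combined model predicts that data better.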

Domains

Machine Learning [stat.ML]

Main file

RT-430_updated.pdf (7.95 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-00728771

Submitted on: Monday, 25 March 2013, 17:37:02

Last modified on: Friday, 24 March 2023, 14:52:56

Long-term archiving on: Sunday, 2 April 2017, 20:02:16

Dates and versions

hal-00728771, version 1 (6 September 2012)

hal-00728771, version 2 (26 October 2012)

hal-00728771, version 3 (25 March 2013)

Identifiers

HAL Id: hal-00728771, version 3

Cite

Stanislaw Raczynski, Emmanuel Vincent, Shigeki Sagayama. Dynamic Bayesian networks for symbolic polyphonic pitch modeling. [Technical Report] RT-0430, 2012. ⟨hal-00728771v3⟩


Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D5 INRIA2 LARA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM

274 views

316 downloads
