Estimating the structural segmentation of popular music pieces under regularity constraints

Gabriel Sargent (1), Frédéric Bimbot (2), Emmanuel Vincent (3)
(1) LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA_D6 - MEDIA ET INTERACTIONS
(2) PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio
Inria Rennes – Bretagne Atlantique, IRISA_D5 - SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE
(3) MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Music structure estimation has recently emerged as a central topic within the field of Music Information Retrieval. Indeed, as music is a highly structured information stream, determining how a music piece is organized is a key challenge for improving the management and exploitation of large music collections. This article focuses on the benefits that can be expected from a regularity constraint on the structural segmentation of popular music pieces. Specifically, we study how a constraint that favors structural segments of comparable size provides a better conditioning of the boundary estimation process. First, we propose a formulation of the structural segmentation task as an optimization process which separates the contribution of the audio features from that of the constraint. We illustrate how the corresponding cost function can be minimized using a Viterbi algorithm. We briefly present its implementation and results in three systems designed for and submitted to the MIREX 2010, 2011 and 2012 evaluation campaigns. Then, we explore the benefits of the regularity constraint as an efficient means of combining the outputs of a selection of systems presented at MIREX between 2010 and 2015, yielding a level of performance competitive with the state of the art on the "MIREX10" dataset (100 J-Pop songs from the RWC database).
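As a rough illustration of the kind of optimization the abstract describes, the sketch below minimizes a cost that adds an audio-based term to a regularity term, using Viterbi-style dynamic programming over boundary positions. It is not the authors' implementation: the function name `segment_with_regularity`, the per-position `boundary_cost` input, the absolute-deviation penalty and the weight `lam` are all assumptions made for this example.

```python
def segment_with_regularity(boundary_cost, target_len, lam=1.0):
    """Return (boundaries, total_cost) for a toy regularity-constrained segmentation.

    boundary_cost[t] is a hypothetical audio-based cost of ending a segment at
    position t (positions 0..n, e.g. beats). The regularity term penalizes
    segments whose length deviates from target_len; lam weights that term.
    """
    n = len(boundary_cost) - 1
    inf = float("inf")
    best = [inf] * (n + 1)   # best[t]: minimal cost of a segmentation of [0, t]
    prev = [-1] * (n + 1)    # prev[t]: previous boundary on the best path to t
    best[0] = 0.0

    for t in range(1, n + 1):
        for s in range(t):
            # Assumed regularity penalty: relative deviation from the target length.
            reg = abs((t - s) - target_len) / target_len
            cost = best[s] + boundary_cost[t] + lam * reg
            if cost < best[t]:
                best[t] = cost
                prev[t] = s

    # Backtrack from the end of the piece to recover the boundary sequence.
    bounds = [n]
    while bounds[-1] != 0:
        bounds.append(prev[bounds[-1]])
    return bounds[::-1], best[n]


# Toy input: a weak audio cue at position 7 and a strong one at the end (16).
costs = [1.0] * 17
costs[7] = 0.2
costs[16] = 0.0
print(segment_with_regularity(costs, target_len=8, lam=0.0)[0])
print(segment_with_regularity(costs, target_len=8, lam=1.0)[0])
```

In this toy input, setting `lam=0.0` leaves the audio term alone and the piece is kept as a single segment, whereas `lam=1.0` makes a boundary near the middle worthwhile because two segments of roughly the target length incur a smaller regularity penalty.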
Document type:
Journal article
IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE, 2017
Cited literature [72 references]

https://hal.inria.fr/hal-01403210
Contributor: Gabriel Sargent
Submitted on: Wednesday, December 14, 2016 - 13:34:34
Last modified on: Wednesday, August 2, 2017 - 10:07:08
Document(s) archived on: Wednesday, March 15, 2017 - 13:39:01

File

Sargent_et_al_TASLP.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01403210, version 1

Citation

Gabriel Sargent, Frédéric Bimbot, Emmanuel Vincent. Estimating the structural segmentation of popular music pieces under regularity constraints. IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE, 2017. 〈hal-01403210〉

Metrics

Record views: 589

Document downloads: 147