Master thesis

Online learning for audio clustering and segmentation

Alberto Bietti 1, 2
1 MuTant - Synchronous Realtime Processing and Programming of Music Signals
Inria Paris-Rocquencourt, UPMC - Université Pierre et Marie Curie - Paris 6, IRCAM - Institut de Recherche et Coordination Acoustique/Musique, CNRS - Centre National de la Recherche Scientifique
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract : Audio segmentation, an essential problem in many audio signal processing tasks, aims to divide an audio signal into homogeneous chunks, or segments. Most current approaches rely on a change-point detection phase to find segment boundaries, followed by a similarity matching phase that identifies similar segments. In this thesis, we focus instead on joint segmentation and clustering algorithms, which solve both tasks simultaneously through the use of unsupervised learning techniques in sequential models. Hidden Markov and semi-Markov models are a natural choice for this modeling task, and we present their use in the context of audio segmentation. We then explore the use of online learning techniques in sequential models and their application to real-time audio segmentation tasks. We present an existing online EM algorithm for hidden Markov models and extend it to hidden semi-Markov models by introducing a different parameterization of semi-Markov chains. Finally, we develop new online learning algorithms for sequential models based on incremental optimization of surrogate functions.
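To illustrate what "joint segmentation and clustering" with a hidden Markov model can mean in practice, here is a minimal sketch in Python. It is not the thesis's algorithm: it assumes the model parameters are already known and runs standard offline Viterbi decoding, whereas the thesis is concerned with learning such parameters online. The function names (`viterbi`, `segments`), the one-dimensional Gaussian emissions with a shared variance, and the single `stay_prob` self-transition parameter are all assumptions made for the example.

```python
import math


def viterbi(obs, means, var, stay_prob):
    """Most likely state path for a toy Gaussian-emission HMM.

    Assumptions (illustrative only): 1-D observations, one mean per state,
    a shared variance, a uniform initial distribution, and a transition
    matrix with `stay_prob` on the diagonal and the rest spread evenly.
    Requires at least two states.
    """
    k = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (k - 1))

    def log_emit(x, s):
        # Log-density of a Gaussian with mean means[s] and variance var.
        return -0.5 * math.log(2 * math.pi * var) - (x - means[s]) ** 2 / (2 * var)

    def log_trans(r, s):
        return log_stay if r == s else log_move

    # score[s] = best log-probability of any path ending in state s.
    score = [math.log(1.0 / k) + log_emit(obs[0], s) for s in range(k)]
    back = []  # back[t-1][s] = best predecessor of state s at time t
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(k):
            best = max(range(k), key=lambda r: prev[r] + log_trans(r, s))
            ptr.append(best)
            score.append(prev[best] + log_trans(best, s) + log_emit(x, s))
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse a state path into (start, end_exclusive, state) triples."""
    segs, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segs.append((start, t, path[start]))
            start = t
    return segs


obs = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.2, -0.1]
path = viterbi(obs, means=[0.0, 5.0], var=1.0, stay_prob=0.9)
print(segments(path))  # [(0, 3, 0), (3, 6, 1), (6, 8, 0)]
```

In the output, the first and last segments share the state label 0: the decoded path gives the segment boundaries (where the state changes) and the clustering of segments (which segments share a state) in one pass, rather than in two separate phases.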

Cited literature [43 references]

https://hal.inria.fr/hal-01064672
Contributor : Alberto Bietti
Submitted on : Thursday, October 9, 2014 - 2:21:32 AM
Last modification on : Tuesday, July 13, 2021 - 2:17:09 PM
Long-term archiving on : Saturday, January 10, 2015 - 10:10:36 AM

File

ms-thesis.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01064672, version 2

Citation

Alberto Bietti. Online learning for audio clustering and segmentation. Machine Learning [cs.LG]. 2014. ⟨hal-01064672v2⟩

Metrics

  • Record views : 627
  • File downloads : 3875