Scalable audio separation with light kernel additive modelling

Antoine Liutkus (1, 2), Derry Fitzgerald (3), Zafar Rafii (4)
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
2 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Recently, Kernel Additive Modelling (KAM) was proposed as a unified framework for multichannel audio source separation. Its main feature is the use of kernel models to describe the spectrograms of the sources locally. Such kernels can capture source features such as repetition, stability over time and/or frequency, self-similarity, etc. KAM notably subsumes many popular and effective state-of-the-art methods, including REPET and harmonic/percussive separation with median filters. However, in its initial form it also has an important drawback: its memory usage scales poorly with the number of sources. Indeed, KAM requires storing the full-resolution spectrogram of each source, which may become prohibitive for full-length tracks or many sources. In this paper, we show how KAM can be combined with a fast compression algorithm for its parameters to address this scalability issue, thus enabling its use on small platforms or mobile devices.
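To make the median-filter special case named in the abstract more concrete, here is a minimal single-channel sketch in Python (NumPy only). It is not the authors' implementation and does not include the compression step the paper proposes. The function names `median_filter_1d`, `hpss_masks` and `full_model_bytes`, the filter length of 17, the squared-magnitude soft mask and the toy spectrogram are all assumptions made for illustration. The last function only restates the abstract's memory argument as arithmetic: one full-resolution spectrogram is stored per source.

```python
import numpy as np


def median_filter_1d(spec, size, axis):
    """Running median of odd length `size` along `axis`, with edge padding."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    pad = [(0, 0)] * spec.ndim
    pad[axis] = (half, half)
    padded = np.pad(spec, pad, mode="edge")
    # The window dimension is appended as the last axis.
    windows = np.lib.stride_tricks.sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def hpss_masks(mag, size=17, eps=1e-12):
    """Soft masks for harmonic and percussive parts of a magnitude spectrogram.

    `mag` has shape (n_freq, n_frames). A median over time keeps components
    that are stable in time (harmonic); a median over frequency keeps
    components that are stable in frequency (percussive).
    """
    harm = median_filter_1d(mag, size, axis=1)  # kernel: neighbours in time
    perc = median_filter_1d(mag, size, axis=0)  # kernel: neighbours in frequency
    total = harm**2 + perc**2 + eps
    return harm**2 / total, perc**2 / total


def full_model_bytes(n_freq, n_frames, n_sources, bytes_per_value=4):
    """Memory needed to keep one full-resolution spectrogram per source."""
    return n_freq * n_frames * n_sources * bytes_per_value


# Toy spectrogram (assumed data): one sustained tone and one click.
mag = np.zeros((64, 64))
mag[10, :] = 1.0   # horizontal line: constant frequency over time
mag[:, 30] = 1.0   # vertical line: one frame, all frequencies
mask_h, mask_p = hpss_masks(mag)
```

In this toy case the harmonic mask is close to 1 along the horizontal line and the percussive mask is close to 1 along the vertical line. `full_model_bytes` grows linearly with the number of sources, which is the scaling problem the abstract describes.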
Document type:
Conference paper
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia. 2015

https://hal.inria.fr/hal-01114890
Contributor: Antoine Liutkus
Submitted on: Tuesday, February 10, 2015 - 13:50:10
Last modified on: Wednesday, February 21, 2018 - 07:50:09

File

ICASSP-lightKAM.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01114890, version 2

Citation

Antoine Liutkus, Derry Fitzgerald, Zafar Rafii. Scalable audio separation with light kernel additive modelling. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia. 2015. 〈hal-01114890v2〉
