Sketching for Large-Scale Learning of Mixture Models

Nicolas Keriven (1), Anthony Bourrier (1, 2), Rémi Gribonval (1), Patrick Perez (2)
1 PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio
Inria Rennes – Bretagne Atlantique, IRISA-D5 - SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE
Abstract: Learning parameters from voluminous data can be prohibitive in terms of memory and computational requirements. We propose a "compressive learning" framework where we first sketch the data by computing random generalized moments of the underlying probability distribution, then estimate mixture model parameters from the sketch using an iterative algorithm analogous to greedy sparse signal recovery. We exemplify our framework with the sketched estimation of Gaussian Mixture Models (GMMs). We experimentally show that our approach yields results comparable to the classical Expectation-Maximization (EM) technique while requiring significantly less memory and fewer computations when the number of database elements is large. We report large-scale experiments in speaker verification, where our approach makes it possible to fully exploit a corpus of 1000 hours of speech signal to learn a universal background model at scales computationally inaccessible to EM.
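To make the sketching step in the abstract concrete, here is a minimal illustration in Python with NumPy. It assumes that the "random generalized moments" are samples of the empirical characteristic function at randomly drawn Gaussian frequencies; the abstract does not state this choice, and the names `draw_frequencies`, `sketch`, and `merge_sketches`, the frequency scale, and the sign convention are all hypothetical. The parameter estimation step (the greedy iterative algorithm) is not shown.

```python
import numpy as np


def draw_frequencies(m, d, scale=1.0, seed=0):
    """Draw m random frequency vectors in R^d (assumed Gaussian here)."""
    rng = np.random.default_rng(seed)
    return rng.normal(scale=scale, size=(m, d))


def sketch(X, W):
    """Average exp(i * w^T x) over the rows x of X, for each frequency w in W.

    X has shape (n, d) and W has shape (m, d); the result is a complex
    vector of length m whose size does not depend on n.
    """
    return np.exp(1j * X @ W.T).mean(axis=0)


def merge_sketches(z_a, n_a, z_b, n_b):
    """Combine the sketches of two data chunks of sizes n_a and n_b."""
    return (n_a * z_a + n_b * z_b) / (n_a + n_b)


# Toy data: 1000 points in 2 dimensions, sketched with 50 frequencies.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
W = draw_frequencies(m=50, d=2)

z_full = sketch(X, W)
# The same sketch, computed chunk by chunk and then merged.
z_merged = merge_sketches(sketch(X[:400], W), 400, sketch(X[400:], W), 600)
```

Because the sketch is an average over data points, it can be accumulated in a single pass over the data or computed on separate chunks and merged, which is one way to read the memory savings claimed in the abstract.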
Document type:
Conference paper
2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Mar 2016, Shanghai, China

https://hal.inria.fr/hal-01208027
Contributor: Nicolas Keriven
Submitted on: Friday, 23 October 2015 - 15:35:41
Last modified on: Monday, 9 October 2017 - 13:36:04
Document(s) archived on: Sunday, 24 January 2016 - 13:45:41

File

paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01208027, version 2

Citation

Nicolas Keriven, Anthony Bourrier, Rémi Gribonval, Patrick Perez. Sketching for Large-Scale Learning of Mixture Models. 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Mar 2016, Shanghai, China. 〈hal-01208027v2〉
