Large-Scale High-Dimensional Clustering with Fast Sketching

Antoine Chatalic (1), Rémi Gribonval (1), Nicolas Keriven (1)
(1) PANAMA - Parcimonie et Nouveaux Algorithmes pour le Signal et la Modélisation Audio
Inria Rennes – Bretagne Atlantique, IRISA_D5 - SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE
Abstract: In this paper, we address the problem of high-dimensional k-means clustering in a large-scale setting, i.e., for datasets that comprise a large number of items. Sketching techniques have already been used to deal with this “large-scale” issue, by compressing the whole dataset into a single vector of random nonlinear generalized moments from which the k centroids are then retrieved efficiently. However, the cost of this approach usually scales quadratically with the dimension; to cope with high-dimensional datasets, we show how to use fast structured random matrices to compute the sketching operator efficiently. This yields significant speed-ups and memory savings for high-dimensional data, while the clustering results are shown to be much more stable, both on artificial and real datasets.
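The sketching idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature map or the structured matrices, so everything below is an assumption made for illustration: the moments are taken to be random Fourier features (averages of complex exponentials), the structured matrix is taken to be a product of three Walsh-Hadamard transforms with random sign diagonals, and the names `dense_sketch`, `fwht` and `structured_sketch` are hypothetical. The centroid-recovery step is not shown.

```python
import numpy as np

def dense_sketch(X, W):
    """Average of exp(i * W^T x) over the rows x of X (cost O(n * d * m))."""
    return np.exp(1j * (X @ W)).mean(axis=0)

def fwht(A):
    """Unnormalized fast Walsh-Hadamard transform along the last axis."""
    d = A.shape[-1]
    assert (d & (d - 1)) == 0, "length must be a power of two"
    lead = A.shape[:-1]
    h = 1
    while h < d:
        # Butterfly step: split into blocks of width 2h, combine the two halves.
        A = A.reshape(*lead, d // (2 * h), 2, h)
        a, b = A[..., 0, :], A[..., 1, :]
        A = np.stack((a + b, a - b), axis=-2).reshape(*lead, d)
        h *= 2
    return A

def structured_sketch(X, signs, radii):
    """Same kind of sketch, with the dense matrix replaced by one structured
    block (assumed form): radii * (H D3)(H D2)(H D1) / d^(3/2)."""
    d = X.shape[1]
    Y = X
    for s in signs:  # one random sign diagonal per Hadamard transform
        Y = fwht(Y * s) / np.sqrt(d)  # H / sqrt(d) is orthogonal
    return np.exp(1j * (Y * radii)).mean(axis=0)

# Toy data; sizes and distributions are illustrative assumptions.
rng = np.random.default_rng(0)
n, d, m = 1000, 8, 8
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, m))                   # dense Gaussian frequencies
signs = rng.choice([-1.0, 1.0], size=(3, d))  # Rademacher diagonals
radii = np.sqrt(rng.chisquare(d, size=d))     # mimic Gaussian row norms
z_dense = dense_sketch(X, W)
z_fast = structured_sketch(X, signs, radii)
```

In this sketch, the dense projection costs O(md) operations per data point for m frequencies, whereas each structured block of d frequencies costs O(d log d) and only stores the sign and radius vectors rather than a d-by-d matrix.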
Document type:
Conference paper
ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Canada. IEEE, pp.4714-4718, 〈https://2018.ieeeicassp.org/〉. 〈10.1109/ICASSP.2018.8461328〉

Cited literature [33 references]

https://hal.inria.fr/hal-01701121
Contributor: Antoine Chatalic
Submitted on: Monday, February 5, 2018 - 15:52:59
Last modified on: Thursday, November 15, 2018 - 11:59:00
Document(s) archived on: Saturday, May 5, 2018 - 02:26:49

File

final_with_reviews.pdf
Files produced by the author(s)

Citation

Antoine Chatalic, Rémi Gribonval, Nicolas Keriven. Large-Scale High-Dimensional Clustering with Fast Sketching. ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Canada. IEEE, pp.4714-4718, 〈https://2018.ieeeicassp.org/〉. 〈10.1109/ICASSP.2018.8461328〉. 〈hal-01701121〉
