Pack only the essentials: Adaptive dictionary learning for kernel ridge regression

Daniele Calandriello 1 Alessandro Lazaric 1 Michal Valko 1
1 SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract: Most kernel-based methods, such as kernel regression, kernel PCA, ICA, or k-means clustering, do not scale to large datasets, because constructing and storing the kernel matrix $K_n$ requires at least $O(n^2)$ time and space for $n$ samples. Recent works (Alaoui 2014, Musco 2016) show that sampling points with replacement according to their ridge leverage scores (RLS) generates small dictionaries of relevant points with strong spectral approximation guarantees for $K_n$. The drawback of RLS-based methods is that computing exact RLS requires constructing and storing the whole kernel matrix. In this paper, we introduce SQUEAK, a new algorithm for kernel approximation based on RLS sampling that processes the dataset sequentially, storing a dictionary that yields accurate kernel matrix approximations with a number of points that depends only on the effective dimension $d_{\text{eff}}(\gamma)$ of the dataset. Moreover, since all the RLS estimations are performed efficiently using only the small dictionary, SQUEAK never constructs the whole matrix $K_n$, runs in time $\widetilde{O}(n \, d_{\text{eff}}(\gamma)^3)$, which is linear in $n$, and requires only a single pass over the dataset.
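To make the quantities in the abstract concrete, the following is a minimal sketch of the exact RLS baseline that the abstract describes as too expensive, not of SQUEAK itself. It assumes the standard definitions $\tau_i(\gamma) = [K_n (K_n + \gamma I)^{-1}]_{ii}$ and $d_{\text{eff}}(\gamma) = \sum_i \tau_i(\gamma)$. The Gaussian kernel, the function names, and the `budget` parameter are illustrative assumptions that do not come from the record.

```python
import numpy as np


def gaussian_kernel(X, bandwidth=1.0):
    """Dense Gaussian kernel matrix (illustrative kernel choice, O(n^2) memory)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def exact_ridge_leverage_scores(K, gamma):
    """Exact RLS: tau_i = [K (K + gamma I)^{-1}]_ii.

    Needs the full n x n kernel matrix, which is exactly the cost
    that RLS-based approximation methods try to avoid.
    """
    n = K.shape[0]
    # (K + gamma I)^{-1} K has the same diagonal as K (K + gamma I)^{-1}.
    Z = np.linalg.solve(K + gamma * np.eye(n), K)
    return np.clip(np.diag(Z), 0.0, 1.0)


def sample_dictionary(tau, budget, rng):
    """Draw `budget` indices with replacement, proportionally to the RLS."""
    p = tau / tau.sum()
    return rng.choice(len(tau), size=budget, replace=True, p=p)


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
K = gaussian_kernel(X)
tau = exact_ridge_leverage_scores(K, gamma=1.0)
d_eff = tau.sum()  # effective dimension d_eff(gamma)
dictionary = sample_dictionary(tau, budget=50, rng=rng)
```

In this sketch `d_eff` is at most `n`, and it shrinks as `gamma` grows; the abstract's claim is that the dictionary size can scale with this quantity rather than with `n`.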
Document type:
Conference paper
Adaptive and Scalable Nonparametric Methods in Machine Learning at Neural Information Processing Systems, 2016, Barcelona, Spain

https://hal.inria.fr/hal-01482756
Contributor: Michal Valko
Submitted on: Friday, March 3, 2017 - 18:29:55
Last modified on: Friday, April 13, 2018 - 01:28:22
Document(s) archived on: Tuesday, June 6, 2017 - 12:01:12

File

calandriello2016pack.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01482756, version 1

Citation

Daniele Calandriello, Alessandro Lazaric, Michal Valko. Pack only the essentials: Adaptive dictionary learning for kernel ridge regression. Adaptive and Scalable Nonparametric Methods in Machine Learning at Neural Information Processing Systems, 2016, Barcelona, Spain. 〈hal-01482756〉
