Preprints, Working Papers, ...

Upper and Lower Bounds on the Performance of Kernel PCA

Maxime Haddouche 1, 2, 3, 4; Benjamin Guedj 5, 2, 6, 7, 4; Omar Rivasplata 2, 5, 4; John Shawe-Taylor 2, 5, 4
7 MODAL - MOdel for Data Analysis and Learning
Inria Lille - Nord Europe, LPP - Laboratoire Paul Painlevé - UMR 8524, METRICS - Evaluation des technologies de santé et des pratiques médicales - ULR 2694, Polytech Lille - École polytechnique universitaire de Lille, Université de Lille, Sciences et Technologies
Abstract: Principal Component Analysis (PCA) is a popular method for dimension reduction and has attracted sustained interest for decades. Kernel PCA has more recently emerged as an extension of PCA but, despite its use in practice, a sound theoretical understanding of kernel PCA is still missing. In this paper, we contribute lower and upper bounds on the efficiency of kernel PCA, involving the empirical eigenvalues of the kernel Gram matrix. Two bounds are for fixed estimators, and two are for randomized estimators through PAC-Bayes theory. We control how much information is captured by kernel PCA on average, and we dissect the bounds to highlight strengths and limitations of the kernel PCA algorithm, thereby contributing to a better understanding of kernel PCA. Our bounds are briefly illustrated on a toy numerical example.
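As a rough illustration of the quantities the abstract refers to, the sketch below computes the empirical eigenvalues of a kernel Gram matrix and splits their normalized sum into the part retained by the top k directions and the part left out. It does not implement the paper's bounds. The RBF kernel, the `gamma` value, the synthetic Gaussian data, the choice k = 10, and the helper names `rbf_gram` and `captured_and_residual` are all assumptions made for this example, and the Gram matrix is left uncentered.

```python
import numpy as np


def rbf_gram(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clamp tiny negative distances caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def captured_and_residual(K, k):
    """Split trace(K)/n into the share of the top-k eigenvalues and the rest."""
    n = K.shape[0]
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigvals = np.linalg.eigvalsh(K)[::-1]
    # A Gram matrix is positive semi-definite; clip round-off negatives.
    eigvals = np.clip(eigvals, 0.0, None)
    captured = eigvals[:k].sum() / n
    residual = eigvals[k:].sum() / n
    return captured, residual


# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

K = rbf_gram(X, gamma=0.5)
captured, residual = captured_and_residual(K, k=10)
print(f"captured: {captured:.4f}, residual: {residual:.4f}")
```

With an RBF kernel every diagonal entry of the Gram matrix equals 1, so the two quantities sum to 1 up to round-off, which makes the split easy to read as a proportion.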
Contributor: Benjamin Guedj
Submitted on : Monday, December 21, 2020 - 11:29:19 AM
Last modification on : Thursday, January 20, 2022 - 4:15:59 PM




  • HAL Id : hal-03084598, version 1
  • ARXIV : 2012.10369


Maxime Haddouche, Benjamin Guedj, Omar Rivasplata, John Shawe-Taylor. Upper and Lower Bounds on the Performance of Kernel PCA. 2020. ⟨hal-03084598⟩


