Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly

Benjamin Guedj (1, 2, 3, 4), Le Li (5)
1 MODAL - MOdel for Data Analysis and Learning
Inria Lille - Nord Europe, LPP - Laboratoire Paul Painlevé - UMR 8524, METRICS - Evaluation des technologies de santé et des pratiques médicales - ULR 2694, Polytech Lille - École polytechnique universitaire de Lille, Université de Lille, Sciences et Technologies
Abstract: When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. Principal curves act as a nonlinear generalization of PCA and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation (called slpc, for Sequential Learning Principal Curves) that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret computation and performance on synthetic and real-life data.

https://hal.inria.fr/hal-01796011
Contributor: Benjamin Guedj
Submitted on: Wednesday, May 8, 2019 - 10:06:29 PM
Last modification on: Friday, November 27, 2020 - 2:18:03 PM

Identifiers

  • HAL Id: hal-01796011, version 2

Citation

Benjamin Guedj, Le Li. Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly. 2019. ⟨hal-01796011v2⟩
