21744 articles – 15574 references  [version française]

inria-00483017, version 1

Brownian Motions and Scrambled Wavelets for Least-Squares Regression

Odalric-Ambrym Maillard () 1, Rémi Munos () 1

(2010)

Abstract: We consider ordinary (non penalized) least-squares regression where the regression function is chosen in a randomly generated sub-space GP \subset S of finite dimension P, where S is a function space of infinite dimension, e.g. L2([0, 1]^d). GP is defined as the span of P random features that are linear combinations of the basis functions of S weighted by random Gaussian i.i.d. coefficients. We characterize the so-called kernel space K \subset S of the resulting Gaussian process and derive approximation error bounds of order O(||f||^2_K log(P)/P) for functions f \in K approximated in GP . We apply this result to derive excess risk bounds for the least-squares estimate in various spaces. For illustration, we consider regression using the so-called scrambled wavelets (i.e. random linear combinations of wavelets of L2([0, 1]^d)) and derive an excess risk rate O(||f*||_K(logN)/sqrt(N)) which is arbitrarily close to the minimax optimal rate (up to a logarithmic factor) for target functions f* in K = H^s([0, 1]^d), a Sobolev space of smoothness order s > d/2. We describe an efficient implementation using lazy expansions with numerical complexity ˜O(2dN^3/2 logN+N^5/2), where d is the dimension of the input data and N is the number of data.

  • 1:  SEQUEL (INRIA Lille - Nord Europe)
  • INRIA – CNRS : UMR8146 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales – Ecole Centrale de Lille
  • Domain : Statistics/Machine Learning
    Computer Science/Learning
    Mathematics/Statistics
    Statistics/Statistics Theory
 
  • inria-00483017, version 1
  • oai:hal.inria.fr:inria-00483017
  • From: 
  • Submitted on: Wednesday, 12 May 2010 11:54:50
  • Updated on: Wednesday, 12 May 2010 15:58:02