Variance reduction in purely random forests

Robin Genuer 1
1 SISTM - Statistics In System biology and Translational Medicine
Epidémiologie et Biostatistique [Bordeaux], Inria Bordeaux - Sud-Ouest
Abstract: Random forests, introduced by Leo Breiman in 2001, are a very effective statistical method. The complex mechanism of the method makes theoretical analysis difficult. Therefore, simplified versions of random forests, called purely random forests, which are easier to handle theoretically, have been considered. In this paper we study the variance of such forests. First, we show a general upper bound which emphasizes the fact that a forest reduces the variance. We then introduce a simple variant of purely random forests, which we call purely uniformly random forests. For this variant, in the context of regression problems with a one-dimensional predictor space, we show that both random trees and random forests reach the minimax rate of convergence. In addition, we prove that, compared to random trees, random forests improve accuracy by reducing the estimator variance by a factor of three fourths.
Document type:
Journal article
Journal of Nonparametric Statistics, American Statistical Association, 2012, 2, pp.18 - 562. 〈10.1007/978-1-4899-0027-2〉

Cited literature: 13 references

https://hal.inria.fr/hal-01590513
Contributor: Robin Genuer
Submitted on: Tuesday, 19 September 2017 - 16:34:24
Last modified on: Wednesday, 20 September 2017 - 01:10:42

File

genuer.var-reduc-prf-preprint....
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Robin Genuer. Variance reduction in purely random forests. Journal of Nonparametric Statistics, American Statistical Association, 2012, 2, pp.18 - 562. 〈10.1007/978-1-4899-0027-2〉. 〈hal-01590513〉
