
VSURF : un package R pour la sélection de variables à l'aide de forêts aléatoires (VSURF: an R package for variable selection using random forests)

Abstract: This paper describes the R package VSURF. Based on random forests, it delivers two subsets of variables according to two types of variable selection for classification or regression problems. The first is a subset of important variables which are relevant for interpretation, while the second one is a subset corresponding to a parsimonious prediction model. The strategy is based on a preliminary ranking of the explanatory variables using the random forests permutation-based score of importance and proceeds using a stepwise ascending variable introduction strategy. The two proposals can be obtained automatically using data-driven default values, good enough to provide interesting results, but can also be fine-tuned by the user. The algorithm is illustrated on a simulated example and its applications to real datasets are presented.
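The two ideas named in the abstract, ranking variables by an importance score and then introducing them one at a time, can be illustrated with a minimal Python sketch. This is not the VSURF implementation: the function names, the toy importance scores, the toy error function and the `min_gain` threshold are all hypothetical, and the error function merely stands in for whatever prediction-error estimate the package computes from random forests.

```python
from typing import Callable, Dict, List, Sequence


def rank_by_importance(importance: Dict[str, float]) -> List[str]:
    """Return variable names sorted by decreasing importance score."""
    return sorted(importance, key=lambda name: importance[name], reverse=True)


def stepwise_ascending(
    ranked: Sequence[str],
    error_of: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Introduce variables in ranked order.

    A variable is kept only if adding it lowers the error by more than
    `min_gain` (a hypothetical, user-chosen threshold).
    """
    if not ranked:
        return []
    selected = [ranked[0]]            # always start from the top-ranked variable
    current = error_of(selected)
    for name in ranked[1:]:
        candidate = selected + [name]
        err = error_of(candidate)
        if current - err > min_gain:  # keep only clearly useful variables
            selected, current = candidate, err
    return selected


# Hypothetical toy data: importance scores and per-variable error reductions.
toy_importance = {"x1": 0.9, "x2": 0.6, "x3": 0.01, "x4": 0.05}
toy_contribution = {"x1": 0.5, "x2": 0.3, "x3": 0.0, "x4": 0.01}


def toy_error(subset: List[str]) -> float:
    """Toy error: starts at 1.0 and drops by each included variable's contribution."""
    return 1.0 - sum(toy_contribution[name] for name in subset)


ranking = rank_by_importance(toy_importance)
chosen = stepwise_ascending(ranking, toy_error, min_gain=0.05)
print(ranking)
print(chosen)
```

In this toy run the weakly informative variables are ranked last and then rejected because they do not reduce the error by more than the threshold, which mirrors the goal of a parsimonious prediction subset.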
Document type: Conference paper
Contributor: Robin Genuer
Submitted on : Wednesday, December 17, 2014 - 9:39:03 AM
Last modification on : Thursday, August 4, 2022 - 4:58:11 PM
Long-term archiving on : Monday, March 23, 2015 - 2:40:49 PM


Files produced by the author(s)


  • HAL Id : hal-01096233, version 1


Robin Genuer, Jean-Michel Poggi, Christine Tuleau-Malot. VSURF : un package R pour la sélection de variables à l'aide de forêts aléatoires. 46èmes Journées de Statistique, 2014, Rennes, France. ⟨hal-01096233⟩


