Grouped variable importance with random forests and application to multiple functional data analysis

Abstract : The selection of grouped variables using the random forest algorithm is considered. First, a new importance measure adapted to groups of variables is proposed. Theoretical insights into this criterion are given for additive regression models. Second, an original method for selecting functional variables based on the grouped variable importance measure is developed. Using a wavelet basis, it is proposed to group all of the wavelet coefficients of a given functional variable and to apply a wrapper selection algorithm to these groups. Various other groupings that take advantage of the frequency and time localization of the wavelet basis are proposed. An extensive simulation study is performed to illustrate the use of the grouped importance measure in this context. The method is applied to a real-life problem from aviation safety.
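The idea of a grouped importance measure can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a permutation-style reading of the measure (the increase in prediction error when all variables of a group are permuted jointly), uses a held-out sample rather than the out-of-bag samples of a forest, and replaces the fitted random forest with a hypothetical stand-in `predict` function. The names `mse`, `grouped_permutation_importance`, and the toy data are illustrative assumptions.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` on rows X with targets y."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def grouped_permutation_importance(predict, X, y, group, n_repeats=20, seed=0):
    """Average increase in MSE when the columns in `group` are permuted jointly.

    `predict` is any fitted predictor taking one row (assumption: in practice
    this would be a fitted random forest). `group` is a list of column indices.
    """
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        # One shared row permutation for the whole group: columns of the group
        # stay aligned with each other, but their link to y is broken.
        perm = list(range(len(X)))
        rng.shuffle(perm)
        X_perm = []
        for i, row in enumerate(X):
            new_row = list(row)
            for j in group:
                new_row[j] = X[perm[i]][j]
            X_perm.append(new_row)
        increases.append(mse(predict, X_perm, y) - baseline)
    return sum(increases) / n_repeats


# Toy data (hypothetical): columns 0-1 carry the signal, columns 2-3 are noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [row[0] + row[1] for row in X]


def predict(row):
    """Stand-in for a fitted model; it only uses the two signal columns."""
    return row[0] + row[1]


imp_signal = grouped_permutation_importance(predict, X, y, group=[0, 1])
imp_noise = grouped_permutation_importance(predict, X, y, group=[2, 3])
print("signal group:", imp_signal)
print("noise group:", imp_noise)
```

In the functional setting described in the abstract, a group would hold the column indices of the wavelet coefficients of one functional variable, so that the whole curve is scored as a single unit.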
Document type :
Preprints, Working Papers, ...

Cited literature : 45 references

https://hal.inria.fr/hal-01084301
Contributor : Baptiste Gregorutti
Submitted on : Thursday, April 9, 2015 - 10:49:38 AM
Last modification on : Wednesday, January 26, 2022 - 3:41:51 AM
Long-term archiving on : Tuesday, April 18, 2017 - 3:29:09 PM

File

Grouped_Variable_Importance_re...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01084301, version 2

Citation

Baptiste Gregorutti, Bertrand Michel, Philippe Saint-Pierre. Grouped variable importance with random forests and application to multiple functional data analysis. 2015. ⟨hal-01084301v2⟩
