Grouped variable importance with random forests and application to multiple functional data analysis

Abstract: The selection of grouped variables using the random forest algorithm is considered. First, a new importance measure adapted to groups of variables is proposed. Theoretical insights into this criterion are given for additive regression models. Second, an original method for selecting functional variables based on the grouped variable importance measure is developed. Using a wavelet basis, it is proposed to group all of the wavelet coefficients of a given functional variable and to use a wrapper selection algorithm with these groups. Various other groupings, which take advantage of the frequency and time localization of the wavelet basis, are also proposed. An extensive simulation study is performed to illustrate the use of the grouped importance measure in this context. The method is applied to a real-life problem from aviation safety.
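
To make the idea of a grouped importance measure concrete, the following is a minimal sketch and not the authors' implementation. It assumes the measure is permutation-based, in the spirit of the usual random forest variable importance: all columns of a group are permuted jointly across observations and the resulting increase in mean squared error is recorded. The names `mse`, `grouped_permutation_importance`, `predict` and `groups` are hypothetical, and a plain additive function stands in for a fitted forest so that the example needs only the standard library. The out-of-bag, per-tree computation used with a real forest is not reproduced here.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` over the rows of X against targets y."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def grouped_permutation_importance(predict, X, y, groups, n_repeats=20, seed=0):
    """Average increase in MSE when each group of columns is permuted jointly.

    `groups` maps a group name to a list of column indices (hypothetical
    interface). Rows are shuffled once per repeat and the same shuffle is
    applied to every column of the group, so within-group dependence is kept.
    """
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    n = len(X)
    scores = {}
    for name, cols in groups.items():
        total = 0.0
        for _ in range(n_repeats):
            perm = list(range(n))
            rng.shuffle(perm)
            X_perm = [list(row) for row in X]  # copy so X is left untouched
            for i, j in enumerate(perm):
                for c in cols:
                    X_perm[i][c] = X[j][c]
            total += mse(predict, X_perm, y) - baseline
        scores[name] = total / n_repeats
    return scores


# Toy additive model: only columns 0 and 1 influence the response.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]


def predict(row):
    # Stand-in for a fitted model (assumption: the true regression function).
    return row[0] + 2.0 * row[1]


y = [predict(row) for row in X]
groups = {"signal": [0, 1], "noise": [2, 3]}
importance = grouped_permutation_importance(predict, X, y, groups)
print(importance)
```

In the functional setting described in the abstract, one entry of `groups` could hold the indices of all wavelet coefficients of a single functional variable, or of the coefficients sharing a scale or a time location, so that the same routine ranks whole curves or frequency bands rather than individual coefficients.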
Document type:
Preprint, working paper
2015

Cited literature: 45 references

https://hal.inria.fr/hal-01084301
Contributor: Baptiste Gregorutti
Submitted on: Thursday, 9 April 2015 - 10:49:38
Last modified on: Wednesday, 29 November 2017 - 16:32:52
Document(s) archived on: Tuesday, 18 April 2017 - 15:29:09

File

Grouped_Variable_Importance_re...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01084301, version 2

Citation

Baptiste Gregorutti, Bertrand Michel, Philippe Saint-Pierre. Grouped variable importance with random forests and application to multiple functional data analysis. 2015. 〈hal-01084301v2〉
