Influence functions for CART - Inria - Institut national de recherche en sciences et technologies du numérique
Preprint, Working Paper. Year: 2014

Influence functions for CART

Abstract

This paper deals with measuring the influence of observations on the results obtained with CART classification trees. To define the influence of individual observations on the analysis, we use influence functions to propose general criteria for measuring the sensitivity of the CART analysis and its robustness. The proposals, based on jackknife trees, are organized along two lines: influence on predictions and influence on partitions. In addition, the analysis is extended to the pruned sequences of CART trees to produce a CART-specific notion of influence. A numerical example, the well-known spam dataset, is presented to illustrate the notions developed throughout the paper. Finally, a real dataset relating the administrative classification of cities surrounding Paris, France, to the characteristics of their tax revenue distributions is analyzed using the new influence-based tools.
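As a rough illustration of the jackknife idea behind "influence on predictions", the sketch below refits a classifier with each observation left out in turn and counts how many predictions on the full sample differ from those of the reference fit. This is not the paper's implementation: the depth-one stump on a single feature is a dependency-free stand-in for a full CART tree, the change-count measure is one plausible choice among several, and all function names (`fit_stump`, `jackknife_prediction_influence`, and so on) and the toy data are hypothetical.

```python
from collections import Counter


def majority(labels):
    # Most frequent label; ties go to the smallest label for determinism.
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)


def gini(labels):
    # Gini impurity of a list of class labels.
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(X, y):
    # Depth-one tree on a single numeric feature (stand-in for CART).
    # Returns (threshold, left_label, right_label).
    best = None
    values = sorted(set(X))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(X, y) if xi <= t]
        right = [yi for xi, yi in zip(X, y) if xi > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[0]:
            best = (score, t, majority(left), majority(right))
    if best is None:
        # All feature values identical: no split is possible.
        m = majority(y)
        return (float("inf"), m, m)
    return best[1:]


def predict(stump, x):
    t, left_label, right_label = stump
    return left_label if x <= t else right_label


def jackknife_prediction_influence(X, y):
    # For each i, refit without observation i (the jackknife fit) and count
    # the observations whose predicted label differs from the reference fit.
    reference = fit_stump(X, y)
    ref_pred = [predict(reference, x) for x in X]
    influence = []
    for i in range(len(X)):
        jack = fit_stump(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        changed = sum(predict(jack, x) != p for x, p in zip(X, ref_pred))
        influence.append(changed)
    return influence


# Toy data (made up for illustration): one feature, two classes.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 0, 1, 1, 1]
influence = jackknife_prediction_influence(X, y)
print(influence)
```

On this toy sample only the observation at `x = 4.0` has nonzero influence: removing it moves the split from 3.5 to 4.0, which flips the prediction at that point. With a real CART implementation the same loop applies, with the stump replaced by a full tree fit.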
Main file
cart.influence.pdf (324.16 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00944098, version 1 (10-02-2014)

Identifiers

  • HAL Id: hal-00944098, version 1

Cite

Avner Bar Hen, Servane Gey, Jean-Michel Poggi. Influence functions for CART. 2014. ⟨hal-00944098⟩
251 views
162 downloads
