A Statistical Learning Perspective of Genetic Programming

Merve Amil 1, 2 Nicolas Bredeche 1, 2 Christian Gagné 1, 2 Sylvain Gelly 1, 2 Marc Schoenauer 1, 2 Olivier Teytaud 1, 2
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract: This paper proposes a theoretical analysis of Genetic Programming (GP) from the perspective of statistical learning theory, a well-grounded mathematical toolbox for machine learning. By computing the Vapnik-Chervonenkis dimension of the family of programs that can be inferred by a specific setting of GP, it is proved that a parsimonious fitness ensures universal consistency: empirical error minimization converges to the best possible error as the number of test cases goes to infinity. However, it is also proved that the standard method of placing a hard limit on the program size still results in programs whose size grows without bound as a function of their accuracy. It is also shown that using cross-validation or hold-out to choose the complexity level that optimizes the generalization error rate also leads to bloat. A more elaborate modification of the fitness is therefore proposed to avoid unnecessary bloat while preserving universal consistency.
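As a rough illustration of what a parsimonious fitness can look like, the sketch below adds a size penalty to the empirical error. It is a generic example, not the specific penalty analyzed in the paper: the names `empirical_error`, `penalized_fitness`, `size`, and `lam`, as well as the `lam * size / sqrt(n)` penalty form and the two toy candidates, are assumptions introduced here for illustration only.

```python
import math
from typing import Callable, Sequence, Tuple

Example = Tuple[float, float]  # (input, expected output)


def empirical_error(program: Callable[[float], float],
                    cases: Sequence[Example]) -> float:
    """Mean squared error of the program on the test cases."""
    return sum((program(x) - y) ** 2 for x, y in cases) / len(cases)


def penalized_fitness(program: Callable[[float], float], size: int,
                      cases: Sequence[Example], lam: float = 1.0) -> float:
    """Empirical error plus a size penalty that shrinks as the sample grows.

    The lam * size / sqrt(n) form is an illustrative assumption, not the
    penalty derived in the paper.
    """
    n = len(cases)
    return empirical_error(program, cases) + lam * size / math.sqrt(n)


# Two hypothetical candidates computing the same function:
# a small program and a bloated equivalent, each paired with its node count.
cases = [(float(x), 2.0 * x + 1.0) for x in range(10)]
small = (lambda x: 2.0 * x + 1.0, 5)
bloated = (lambda x: 2.0 * x + 1.0 + 0.0 * x, 9)

# Lower fitness is better: both fit the data equally well,
# so the penalty decides in favor of the smaller program.
best = min([small, bloated],
           key=lambda c: penalized_fitness(c[0], c[1], cases))
```

Both candidates have the same empirical error here, so the selection is driven entirely by the size term; with a hard size limit alone, nothing would separate them as long as both fit under the limit.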
Document type:
Conference paper
EuroGP, 2009, Tuebingen, Germany. Springer, 2009, Proceedings of EuroGP 09

https://hal.inria.fr/inria-00369782
Contributor: Olivier Teytaud
Submitted on: Saturday, March 21, 2009 - 09:09:26
Last modified on: Thursday, April 5, 2018 - 12:30:12
Document(s) archived on: Thursday, June 10, 2010 - 17:57:51

File

eurogp.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00369782, version 1

Citation

Merve Amil, Nicolas Bredeche, Christian Gagné, Sylvain Gelly, Marc Schoenauer, et al. A Statistical Learning Perspective of Genetic Programming. EuroGP, 2009, Tuebingen, Germany. Springer, 2009, Proceedings of EuroGP 09. 〈inria-00369782〉

Metrics

Record views

409

File downloads

942