Genetic Programming, Validation Sets, and Parsimony Pressure

Christian Gagné (1, 2, 3), Marc Schoenauer (1), Marc Parizeau (2), Marco Tomassini (3)
1 TANC - Algorithmic number theory for cryptology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France, Polytechnique - X, CNRS - Centre National de la Recherche Scientifique : UMR7161
Abstract : Fitness functions based on test cases are very common in Genetic Programming (GP). This process can be likened to a learning task, in which models are inferred from a limited number of samples. This paper investigates two methods to improve generalization in GP-based learning: 1) the selection of the best-of-run individuals using a three-data-set methodology, and 2) the application of parsimony pressure to reduce the complexity of the solutions. Results using GP in a binary classification setup show that, while accuracy on the test sets is preserved, with lower variance than the baseline results, the mean tree size obtained with the tested methods is significantly reduced.
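The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: the split fractions, the representation of a candidate as a (classifier, size) pair, and the helper names split_three_ways, error_rate, and select_best_of_run are all assumptions made for illustration. Parsimony is shown here only in its simplest form, as a tie-break that prefers the smaller of two candidates with equal validation error.

```python
import random


def split_three_ways(samples, fractions=(0.5, 0.25, 0.25), seed=0):
    """Shuffle samples and split them into training, validation, and test sets.

    The fractions are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(fractions[0] * len(shuffled))
    n_valid = int(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def error_rate(classifier, samples):
    """Fraction of (input, label) pairs that the classifier gets wrong.

    Assumes samples is non-empty.
    """
    wrong = sum(1 for x, label in samples if classifier(x) != label)
    return wrong / len(samples)


def select_best_of_run(candidates, valid):
    """Pick the candidate with the lowest validation error.

    candidates is a list of (classifier, size) pairs (a hypothetical
    representation). Ties on error are broken in favor of the smaller size.
    """
    return min(candidates, key=lambda c: (error_rate(c[0], valid), c[1]))


# Toy usage: two candidates behave identically, so the smaller one is chosen.
toy_valid = [(5, False), (15, True)]
toy_candidates = [
    (lambda x: x >= 10, 7),  # correct, larger tree
    (lambda x: x >= 10, 3),  # correct, smaller tree
    (lambda x: x >= 0, 1),   # smallest, but wrong on x = 5
]
best_classifier, best_size = select_best_of_run(toy_candidates, toy_valid)
```

In this sketch the training set would drive fitness during evolution, the validation set is used only to choose the best-of-run individual, and the test set is held back for the final accuracy estimate.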
Document type : Conference papers
Collet, Tomassini, Ebner, Ekart and Gustafson. EuroGP 2006, Apr 2006, Budapest, Hungary, Springer Verlag, 2006, LNCS

Cited literature [26 references]

https://hal.inria.fr/inria-00000996
Contributor : Christian Gagné
Submitted on : Wednesday, January 11, 2006 - 4:08:48 PM
Last modification on : Thursday, February 9, 2017 - 3:07:31 PM
Document(s) archived on : Saturday, April 3, 2010 - 9:12:13 PM

Citation

Christian Gagné, Marc Schoenauer, Marc Parizeau, Marco Tomassini. Genetic Programming, Validation Sets, and Parsimony Pressure. Collet, Tomassini, Ebner, Ekart and Gustafson. EuroGP 2006, Apr 2006, Budapest, Hungary, Springer Verlag, 2006, LNCS. 〈inria-00000996〉

Metrics

Record views : 388
File downloads : 853