Universal Consistency and Bloat in GP

Sylvain Gelly 1 Olivier Teytaud 1 Nicolas Bredeche 1 Marc Schoenauer 1
1 TANC - Algorithmic number theory for cryptology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract : In this paper, we provide an analysis of Genetic Programming (GP) from the Statistical Learning Theory viewpoint in the scope of symbolic regression. Firstly, we are interested in Universal Consistency, i.e. the fact that the solution minimizing the empirical error does converge to the best possible error when the number of examples goes to infinity, and secondly, we focus our attention on the uncontrolled growth of program length (i.e. bloat), which is a well-known problem in GP. Results show that (1) several kinds of code bloats may be identified and that (2) Universal consistency can be obtained as well as avoiding bloat under some con- ditions. We conclude by describing an ad hoc method that makes it possible simultaneously to avoid bloat and to ensure universal consistency.
Complete list of metadatas

Cited literature [18 references]  Display  Hide  Download

https://hal.inria.fr/inria-00112840
Contributor : Sylvain Gelly <>
Submitted on : Friday, November 10, 2006 - 10:31:29 AM
Last modification on : Wednesday, March 27, 2019 - 4:41:29 PM
Long-term archiving on : Thursday, September 20, 2012 - 2:35:49 PM

Identifiers

  • HAL Id : inria-00112840, version 1

Collections

Citation

Sylvain Gelly, Olivier Teytaud, Nicolas Bredeche, Marc Schoenauer. Universal Consistency and Bloat in GP. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, Lavoisier, 2006. ⟨inria-00112840⟩

Share

Metrics

Record views

657

Files downloads

940