A Statistical Learning Theory Approach of Bloat

Olivier Teytaud; Sylvain Gelly; Nicolas Bredeche; Marc Schoenauer

Conference paper, Year: 2005

Olivier Teytaud

Role: Author
PersonId : 581
IdHAL : olivier-teytaud
IdRef : 05971008X

Algorithmic number theory for cryptology

Sylvain Gelly

Role: Author

Algorithmic number theory for cryptology

Nicolas Bredeche

Role: Author
PersonId : 184446
IdHAL : nicolas-bredeche
ORCID : 0000-0002-8241-7461
IdRef : 070019452

Algorithmic number theory for cryptology

Marc Schoenauer

Role: Author
PersonId : 739309
IdHAL : evomarc
ORCID : 0000-0003-1450-6830
IdRef : 057775575

Algorithmic number theory for cryptology

Abstract

Code bloat, the excessive increase of code size, is an important issue in Genetic Programming (GP). This paper proposes a theoretical analysis of code bloat in the framework of symbolic regression in GP, from the viewpoint of Statistical Learning Theory, a well-grounded mathematical toolbox for Machine Learning. Two kinds of bloat must be distinguished in that context, depending on whether the target function lies in the search space or not. Then, important mathematical results are proved using classical results from Statistical Learning. Namely, the Vapnik-Chervonenkis dimension of programs is computed, and further results from Statistical Learning make it possible to prove that a parsimonious fitness ensures Universal Consistency (the solution minimizing the empirical error converges to the best possible error when the number of samples goes to infinity). However, it is proved that the standard method, which consists in choosing a maximal program size depending on the number of samples, might still result in programs whose size increases without bound along with their accuracy; a more complicated modification of the fitness is proposed that theoretically avoids unnecessary bloat while nevertheless preserving Universal Consistency.

Domains

Machine Learning [cs.LG]

Main file

antibloatGecco2005_long_version.pdf (104.63 KB)

https://inria.hal.science/inria-00000549

Submitted on: Wednesday, November 2, 2005, 11:05:13

Last modified on: Friday, March 24, 2023, 14:52:47

Long-term archiving on: Tuesday, September 11, 2012, 12:40:44

Dates and versions

inria-00000549 , version 1 (02-11-2005)

Identifiers

HAL Id : inria-00000549 , version 1

Cite

Olivier Teytaud, Sylvain Gelly, Nicolas Bredeche, Marc Schoenauer. A Statistical Learning Theory Approach of Bloat. Genetic and Evolutionary Computation Conference, Jun 2005, Washington D.C. USA. ⟨inria-00000549⟩

Collections

X CNRS INRIA LIX X-LIX X-DEP-INFO PARISTECH INRIA2
