Combiner connaissances expertes, hors-ligne, transientes et en ligne pour l'exploration Monte-Carlo

Louis Chatriot; Christophe Fiter; Guillaume Chaslot; Sylvain Gelly; Jean-Baptiste Hoock; J. Perez; Arpad Rimmel; Olivier Teytaud

Journal Articles

Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle

Year : 2008

Louis Chatriot

Function : Author

Machine Learning and Optimisation

Christophe Fiter

Function : Author
PersonId : 17255
IdHAL : christophe-fiter
ORCID : 0000-0002-7360-7415
IdRef : 166685917

Machine Learning and Optimisation

Guillaume Chaslot

Function : Author

Maastricht University [Maastricht]

Sylvain Gelly

Function : Author

Machine Learning and Optimisation

Algorithmic number theory for cryptology

Laboratoire de Recherche en Informatique

Jean-Baptiste Hoock

Function : Author

Machine Learning and Optimisation

J. Perez

Function : Author

Machine Learning and Optimisation

Arpad Rimmel

Function : Author
PersonId : 18807
IdHAL : arpad-rimmel
IdRef : 140527273

Machine Learning and Optimisation

Olivier Teytaud

Function : Author
PersonId : 581
IdHAL : olivier-teytaud
IdRef : 05971008X

Machine Learning and Optimisation

Algorithmic number theory for cryptology

Laboratoire de Recherche en Informatique

Abstract

We combine, for Monte-Carlo tree exploration, machine learning at four different time scales:
– online regret, through the use of bandit algorithms and Monte-Carlo estimates;
– transient learning, through the use of rapid estimates of the Q-function (RAVE, for Rapid Action Value Estimate), which are learnt online and used to accelerate the exploration, but are then gradually set aside as finer information becomes available;
– offline learning, by data mining of datasets of games;
– use of expert knowledge, coming from the old ages, as prior information.

The resulting algorithm is stronger than each element taken separately. We also highlight an exploration-exploitation dilemma in Monte-Carlo tree exploration and obtain a very strong improvement by tuning the corresponding parameters.
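As an illustration of how the time scales described in the abstract might be blended into a single move value, here is a minimal sketch. It is not taken from the article: the weighting schedule `beta = rave_k / (rave_k + n)`, the treatment of the offline/expert prior as virtual simulations, and every name and constant (`MoveStats`, `blended_value`, `rave_k`, `prior_weight`) are assumptions chosen for readability. The bandit exploration term used for online move selection is left out.

```python
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Per-move statistics at one tree node (hypothetical structure)."""
    mc_wins: float = 0.0      # wins of simulations that started with this move
    mc_visits: int = 0        # number of such simulations
    rave_wins: float = 0.0    # wins of simulations in which the move appeared later on
    rave_visits: int = 0      # number of such simulations
    prior_value: float = 0.5  # offline/expert estimate of the win rate, in [0, 1]


def blended_value(s: MoveStats, rave_k: float = 1000.0,
                  prior_weight: float = 10.0) -> float:
    """Blend prior, RAVE and Monte-Carlo estimates (assumed schedule)."""
    n = s.mc_visits
    # Prior knowledge counts as `prior_weight` virtual simulations, so its
    # influence fades as real simulations accumulate.
    mc = (s.mc_wins + prior_weight * s.prior_value) / (n + prior_weight)
    if s.rave_visits == 0:
        return mc
    rave = s.rave_wins / s.rave_visits
    # Transient weight: close to 1 when the move has few simulations,
    # tending to 0 as n grows, so RAVE is progressively set aside.
    beta = rave_k / (rave_k + n)
    return beta * rave + (1.0 - beta) * mc
```

With no simulations the value is driven by RAVE when RAVE statistics exist and by the prior otherwise; with many simulations it approaches the plain Monte-Carlo win rate.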

Keywords

computer-go; transient learning; expert knowledge; offline learning; online learning; Monte-Carlo Tree Search; UCT

Domains

Optimization and Control [math.OC]; Machine Learning [cs.LG]

Main file

eg_french.pdf (312.71 KB)

Origin : Files produced by the author(s)

Contributor : Olivier Teytaud

https://inria.hal.science/inria-00343509

Submitted on : Monday, December 1, 2008, 5:40:09 PM

Last modification on : Thursday, April 18, 2024, 4:33:54 PM

Long-term archiving on : Monday, June 7, 2010, 10:13:16 PM

Dates and versions

inria-00343509 , version 1 (01-12-2008)

Identifiers

HAL Id : inria-00343509 , version 1

Cite

Louis Chatriot, Christophe Fiter, Guillaume Chaslot, Sylvain Gelly, Jean-Baptiste Hoock, et al. Combiner connaissances expertes, hors-ligne, transientes et en ligne pour l'exploration Monte-Carlo. Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle, 2008. ⟨inria-00343509⟩

Collections

X EC-PARIS CNRS INRIA LIX X-LIX X-DEP-INFO UMR8623 INRIA2 LRI-AO TDS-MACS UNIV-PARIS-SACLAY
