630 results


inria-00150207v1  Conference paper
Pierre-Arnaud Coquelin, Rémi Munos. Bandit Algorithms for Tree Search
Uncertainty in Artificial Intelligence, 2007, Vancouver, Canada. 2007
hal-00750326v1  Conference paper
Rémi Coulom. CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning
H. Jaap van den Herik, Aske Plaat. Advances in Computer Games - 13th International Conference, Nov 2011, Tilburg, Netherlands. Springer, 7168, pp.146-157, 2012, Lecture Notes in Computer Science; Advances in Computer Games. <http://link.springer.com/chapter/10.1007%2F978-3-642-31866-5_13>. <10.1007/978-3-642-31866-5_13>
hal-00981575v2  Conference paper
Tomáš Kocák, Michal Valko, Rémi Munos, Shipra Agrawal. Spectral Thompson Sampling
AAAI Conference on Artificial Intelligence, Jul 2014, Québec City, Canada
hal-00654324v1  Book chapter
Jean-Yves Audibert, Sébastien Bubeck, Rémi Munos. Bandit view on noisy optimization
Optimization for Machine Learning, MIT Press, pp.431-454, 2011, 978-0-262-01646-9
hal-00654404v1  Conference paper
Jean-Yves Audibert, Sébastien Bubeck. Best Arm Identification in Multi-Armed Bandits
COLT - 23rd Conference on Learning Theory - 2010, Jun 2010, Haifa, Israel. 13 p., 2010
hal-01057562v1  Conference paper
Ronald Ortner, Odalric-Ambrym Maillard, Daniil Ryabko. Selecting Near-Optimal Approximate State Representations in Reinforcement Learning
International Conference on Algorithmic Learning Theory (ALT), Oct 2014, Bled, Slovenia. Springer, 8776, pp.140-154, 2014, LNCS
hal-01104739v1  Conference paper
Bilal Piot, Olivier Pietquin, Matthieu Geist. Predicting when to laugh with structured classification
InterSpeech 2014, Sep 2014, Singapore, Singapore. Proceedings of the Annual Conference of the International Speech Communication Association, pp.1786-1790, 2014, <http://www.isca-speech.org/archive/archive_papers/interspeech_2014/i14_1786.pdf>
hal-01104789v1  Conference paper
Bilal Piot, Matthieu Geist, Olivier Pietquin. Méthode de minimisation du résidu de Bellman boostée qui tient compte des démonstrations expertes.
9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), May 2014, Liège, Belgium. 2014
inria-00475214v1  Conference paper
Alessandro Lazaric, Mohammad Ghavamzadeh. Bayesian Multi-Task Reinforcement Learning
ICML - 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. Omnipress, pp.599-606, 2010
hal-00823230v1  Conference paper
Phuong Nguyen, Odalric-Ambrym Maillard, Daniil Ryabko, Ronald Ortner. Competing with an Infinite Set of Models in Reinforcement Learning
AISTATS, 2013, Arizona, United States. 31, pp.463-471, 2013, JMLR W&CP
hal-00923685v1  Conference paper
Alexandra Carpentier, Rémi Munos. Toward optimal stratification for stratified Monte-Carlo integration
International Conference on Machine Learning, 2013, United States. 2013
hal-00771128v1  Conference paper
Daniil Ryabko. Asymptotic Statistical Analysis of Stationary Ergodic Time Series
WITMSE 2012, Aug 2012, Amsterdam, Netherlands. 2012
hal-00923683v1  Conference paper
Nathaniel Korda, Emilie Kaufmann, Rémi Munos. Thompson sampling for one-dimensional exponential family bandits
Advances in Neural Information Processing Systems, 2013, United States. 2013
hal-00823233v1  Conference paper
Daniil Ryabko. Time-series information and learning
ISIT - International Symposium on Information Theory, 2013, Istanbul, Turkey. pp.1392-1395, 2013
hal-00923681v1  Conference paper
Gunnar Kedenburg, Raphael Fonteneau, Rémi Munos. Aggregating optimistic planning trees for solving Markov decision processes
Advances in Neural Information Processing Systems, 2013, United States. pp.2382-2390, 2013
hal-00772046v1  Journal article
Alessandro Lazaric, Rémi Munos. Learning with stochastic inputs and adversarial outputs
Journal of Computer and System Sciences (JCSS), Elsevier, 2012, 78 (5), pp.1516-1537. <http://www.sciencedirect.com/science/article/pii/S002200001200027X>
inria-00177155v1  Conference paper
Rémi Coulom. Monte-Carlo Tree Search in Crazy Stone
Takeshi Ito and Akihiro Kishimoto. 12th Game Programming Workshop, Nov 2007, Hakone, Japan. 2007
hal-01077986v1  Other publication
Frédéric Guillou, Romaric Gaudel, Jérémie Mary, Philippe Preux. User Engagement as Evaluation: a Ranking or a Regression Problem?
2014, <10.1145/2668067.2668073>
hal-00826051v1  Book chapter
Samuel Delepoulle, François Rouselle, Christophe Renaud, Philippe Preux. A comparison of two machine learning approaches for Photometric Solids Compression
Dimitri Plemenos, Georgios Miaoulis. Intelligent Computer Graphics, 321, Springer, pp.145-164, 2010, Studies in Computational Intelligence
hal-00826055v1  Conference paper
Sertan Girgin, Philippe Preux. Basis Expansion in Natural Actor Critic Methods
Girgin, Loth, Munos, Preux. European Workshop on Reinforcement Learning, Jun 2008, Villeneuve d'Ascq, France. Springer, 5323, pp.110-123, 2008, LNAI; Recent Advances in Reinforcement Learning
hal-00826053v1  Book chapter
Samuel Delepoulle, Christophe Renaud, Philippe Preux. Light Source Storage and Interpolation for Global Illumination: a neural solution
Dimitri Plemenos, Georgios Miaoulis. Intelligent Computer Graphics, 240, Springer, pp.87-104, 2009, Studies in Computational Intelligence
hal-00826054v1  Conference paper
Sertan Girgin, Philippe Preux. Basis Function Construction in Reinforcement Learning using Cascade-Correlation Learning Architecture
International Conference on Machine Learning and Applications, Dec 2008, San Diego, United States. IEEE Press, pp.75-82, 2008, Proceedings of the International Conference on Machine Learning and Applications (ICML-A)
hal-00826056v1  Conference paper
Sertan Girgin, Philippe Preux. Feature discovery in reinforcement learning using genetic programming
11th European Conference on Genetic Programming (EUROGP), 2008, Naples, Italy. Springer, 4971, pp.218-229, 2008, LNCS. <http://link.springer.com/chapter/10.1007%2F978-3-540-78671-9_19>
hal-00772626v1  Book chapter
Alessandro Lazaric. Transfer in Reinforcement Learning: a Framework and a Survey
Marco Wiering, Martijn van Otterlo. Reinforcement Learning - State of the art, 12, Springer, pp.143-173, 2012, <10.1007/978-3-642-27645-3_5>
hal-00772615v1  Conference paper
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
NIPS - Twenty-Sixth Annual Conference on Neural Information Processing Systems, Dec 2012, Lake Tahoe, United States. 2012
inria-00329797v1  Conference paper
Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvari. Online Optimization in X-Armed Bandits
Twenty-Second Annual Conference on Neural Information Processing Systems, Dec 2008, Vancouver, Canada. 2008
inria-00124833v1  Conference paper
Andras Antos, Csaba Szepesvari, Rémi Munos. Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007, Hawaii, United States. pp.2007, 2007