|
|
||
|---|---|---|
|
inria-00150207v1
Conference paper
Pierre-Arnaud Coquelin, Rémi Munos. Bandit Algorithms for Tree Search. Uncertainty in Artificial Intelligence, 2007, Vancouver, Canada. 2007 |
||
|
hal-00750326v1
Conference paper
Rémi Coulom. CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning. van den Herik, H. Jaap and Plaat, Aske. Advances in Computer Games - 13th International Conference, Nov 2011, Tilburg, Netherlands. Springer, 7168, pp.146-157, 2012, Lecture Notes in Computer Science; Advances in Computer Games. <http://link.springer.com/chapter/10.1007%2F978-3-642-31866-5_13>. <10.1007/978-3-642-31866-5_13> |
||
|
hal-00981575v2
Conference paper
Tomáš Kocák, Michal Valko, Rémi Munos, Shipra Agrawal. Spectral Thompson Sampling. AAAI Conference on Artificial Intelligence, Jul 2014, Québec City, Canada |
||
|
hal-00750298v1
Report
Amir Sani, Alessandro Lazaric, Rémi Munos. Risk-Aversion in Multi-armed Bandits. [Research Report] 2012 |
||
|
hal-00654324v1
Book chapter
Jean-Yves Audibert, Sébastien Bubeck, Rémi Munos. Bandit view on noisy optimization. Optimization for Machine Learning, MIT Press, pp.431-454, 2011, 978-0-262-01646-9 |
||
|
hal-00654404v1
Conference paper
Jean-Yves Audibert, Sébastien Bubeck. Best Arm Identification in Multi-Armed Bandits. COLT - 23rd Conference on Learning Theory - 2010, Jun 2010, Haifa, Israel. 13 p., 2010 |
||
|
hal-01057562v1
Conference paper
Ronald Ortner, Odalric-Ambrym Maillard, Daniil Ryabko. Selecting Near-Optimal Approximate State Representations in Reinforcement Learning. International Conference on Algorithmic Learning Theory (ALT), Oct 2014, Bled, Slovenia. Springer, 8776, pp.140-154, 2014, LNCS |
||
|
hal-01104739v1
Conference paper
Bilal Piot, Olivier Pietquin, Matthieu Geist. Predicting when to laugh with structured classification. InterSpeech 2014, Sep 2014, Singapore, Singapore. Proceedings of the Annual Conference of the International Speech Communication Association, pp.1786-1790, 2014, <http://www.isca-speech.org/archive/archive_papers/interspeech_2014/i14_1786.pdf> |
||
|
hal-01104789v1
Conference paper
Bilal Piot, Matthieu Geist, Olivier Pietquin. Méthode de minimisation du résidu de Bellman boostée qui tient compte des démonstrations expertes. 9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), May 2014, Liège, Belgium. 2014 |
||
|
inria-00475214v1
Conference paper
Alessandro Lazaric, Mohammad Ghavamzadeh. Bayesian Multi-Task Reinforcement Learning. ICML - 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. Omnipress, pp.599-606, 2010 |
||
|
hal-00823230v1
Conference paper
Phuong Nguyen, Odalric-Ambrym Maillard, Daniil Ryabko, Ronald Ortner. Competing with an Infinite Set of Models in Reinforcement Learning. AISTATS, 2013, Arizona, United States. 31, pp.463-471, 2013, JMLR W&CP |
||
|
inria-00574999v1
Report
Odalric-Ambrym Maillard, Rémi Munos. Adaptive Bandits: Towards the best history-dependent strategy. [Technical Report] 2011, pp.14 |
||
|
inria-00117266v3
Report
Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud. Modification of UCT with Patterns in Monte-Carlo Go. [Research Report] RR-6062, INRIA. 2006 |
||
|
hal-00923685v1
Conference paper
Alexandra Carpentier, Rémi Munos. Toward optimal stratification for stratified Monte-Carlo integration. International Conference on Machine Learning, 2013, United States. 2013 |
||
|
hal-00771128v1
Conference paper
Daniil Ryabko. Asymptotic Statistical Analysis of Stationary Ergodic Time Series. WITMSE 2012, Aug 2012, Amsterdam, Netherlands. 2012 |
||
|
hal-00923683v1
Conference paper
Nathaniel Korda, Emilie Kaufmann, Rémi Munos. Thompson sampling for one-dimensional exponential family bandits. Advances in Neural Information Processing Systems, 2013, United States. 2013 |
||
|
hal-00823233v1
Conference paper
Daniil Ryabko. Time-series information and learning. ISIT - International Symposium on Information Theory, 2013, Istanbul, Turkey. pp.1392-1395, 2013 |
||
|
hal-00923681v1
Conference paper
Gunnar Kedenburg, Raphael Fonteneau, Rémi Munos. Aggregating optimistic planning trees for solving Markov decision processes. Advances in Neural Information Processing Systems, 2013, United States. pp.2382-2390, 2013 |
||
|
hal-00772046v1
Journal article
Alessandro Lazaric, Rémi Munos. Learning with stochastic inputs and adversarial outputs. Journal of Computer and System Sciences (JCSS), Elsevier, 2012, 78 (5), pp.1516-1537. <http://www.sciencedirect.com/science/article/pii/S002200001200027X> |
||
|
inria-00177155v1
Conference paper
Rémi Coulom. Monte-Carlo Tree Search in Crazy Stone. Takeshi Ito and Akihiro Kishimoto. 12th Game Programming Workshop, Nov 2007, Hakone, Japan. 2007 |
||
|
hal-01077986v1
Other publication
Frédéric Guillou, Romaric Gaudel, Jérémie Mary, Philippe Preux. User Engagement as Evaluation: a Ranking or a Regression Problem? 2014, <10.1145/2668067.2668073> |
||
|
hal-00826051v1
Book chapter
Samuel Delepoulle, François Rouselle, Christophe Renaud, Philippe Preux. A comparison of two machine learning approaches for Photometric Solids Compression. Plemenos, Dimitri; Miaoulis, Georgios. Intelligent Computer Graphics, 321, Springer, pp.145-164, 2010, Studies in Computational Intelligence |
||
|
hal-00826055v1
Conference paper
Sertan Girgin, Philippe Preux. Basis Expansion in Natural Actor Critic Methods. Girgin, Loth, Munos, Preux. European Workshop on Reinforcement Learning, Jun 2008, Villeneuve d'Ascq, France. Springer, 5323, pp.110-123, 2008, LNAI; Recent Advances in Reinforcement Learning |
||
|
hal-00826053v1
Book chapter
Samuel Delepoulle, Christophe Renaud, Philippe Preux. Light Source Storage and Interpolation for Global Illumination: a neural solution. Dimitri Plemenos, Georgios Miaoulis. Intelligent Computer Graphics, 240, Springer, pp.87-104, 2009, Studies in Computational Intelligence |
||
|
hal-00826054v1
Conference paper
Sertan Girgin, Philippe Preux. Basis Function Construction in Reinforcement Learning using Cascade-Correlation Learning Architecture. International Conference on Machine Learning and Applications, Dec 2008, San Diego, United States. IEEE Press, pp.75-82, 2008, Proceedings of the International Conference on Machine Learning and Applications (ICML-A) |
||
|
hal-00826056v1
Conference paper
Sertan Girgin, Philippe Preux. Feature discovery in reinforcement learning using genetic programming. 11th European Conference on Genetic Programming (EUROGP), 2008, Naples, Italy. Springer, 4971, pp.218-229, 2008, LNCS. <http://link.springer.com/chapter/10.1007%2F978-3-540-78671-9_19> |
||
|
hal-00772626v1
Book chapter
Alessandro Lazaric. Transfer in Reinforcement Learning: a Framework and a Survey. Marco Wiering, Martijn van Otterlo. Reinforcement Learning - State of the art, 12, Springer, pp.143-173, 2012, <10.1007/978-3-642-27645-3_5> |
||
|
hal-00772615v1
Conference paper
Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence. NIPS - Twenty-Sixth Annual Conference on Neural Information Processing Systems, Dec 2012, Lake Tahoe, United States. 2012 |
||
|
inria-00329797v1
Conference paper
Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvari. Online Optimization in X-Armed Bandits. Twenty-Second Annual Conference on Neural Information Processing Systems, Dec 2008, Vancouver, Canada. 2008 |
||
|
inria-00124833v1
Conference paper
Andras Antos, Csaba Szepesvari, Rémi Munos. Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory. IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007, Hawaii, United States. pp.2007, 2007 |
||
|
|
||