Multi-Agent Reinforcement Learning Algorithms, 2010.
On the Generation of Markov Decision Processes, Journal of the Operational Research Society, vol.46, issue.3, pp.354-361, 1995.
DOI : 10.1057/jors.1995.50
Residual Algorithms: Reinforcement Learning with Function Approximation, Proc. of ICML, 1995.
DOI : 10.1016/B978-1-55860-377-6.50013-X
Competitive Markov Decision Processes, 2012.
DOI : 10.1007/978-1-4612-4054-9
Deep Learning, book in preparation, 2016.
Modelling Transition Dynamics in MDPs With RKHS Embeddings, Proc. of ICML, 2012.
Nash Q-Learning for General-Sum Stochastic Games, Journal of Machine Learning Research, vol.4, pp.1039-1069, 2003.
Value Function Approximation in Zero-Sum Markov Games, Proc. of UAI, 2002.
Reinforcement Learning as Classification: Leveraging Modern Classifiers, Proc. of ICML, 2003.
Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015.
DOI : 10.1038/nature14539
Continuous Control with Deep Reinforcement Learning, Proc. of ICLR, 2016.
Friend-or-Foe Q-Learning in General-Sum Games, Proc. of ICML, 2001.
Finite-Sample Analysis of Bellman Residual Minimization, Proc. of ACML, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00830212
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.
DOI : 10.1038/nature14236
Finite-Time Bounds for Fitted Value Iteration, The Journal of Machine Learning Research, vol.9, pp.815-857, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00120882
Algorithmic Game Theory, 2007.
DOI : 10.1017/CBO9780511800481
Softened Approximate Policy Iteration for Markov Games, Proc. of ICML, 2016.
On the Use of Nonstationary Strategies for Solving Two-Player Zero-Sum Markov Games, Proc. of AISTATS, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01291495
Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games, Proc. of ICML, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01153270
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data.
Boosted Bellman Residual Minimization Handling Expert Demonstrations, Proc. of ECML, 2014.
DOI : 10.1007/978-3-662-44851-9_35
URL : https://hal-supelec.archives-ouvertes.fr/hal-01060953/document/
Difference of Convex Functions Programming for Reinforcement Learning, Proc. of NIPS, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01104419
Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games, Proc. of AAMAS, 2015.
Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
DOI : 10.1002/9780470316887
Approximate Modified Policy Iteration, Proc. of ICML, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00758882
Stochastic Games, Proc. of the National Academy of Sciences of the United States of America, 1953.
Value Function Approximation in Noisy Environments Using Locally Smoothed Regularized Approximate Linear Programs, Proc. of UAI, 2012.
Cyclic Equilibria in Markov Games, Proc. of NIPS, 2006.