E. Altman, T. Boulogne, R. Azouzi, T. Jiménez, and L. Wynter, A survey on networking games in telecommunications, Computers & Operations Research, vol.33, issue.2, pp.286-311, 2006.
DOI : 10.1016/j.cor.2004.06.005

F. Alvarez, J. Bolte, and O. Brahic, Hessian Riemannian Gradient Flows in Convex Programming, SIAM Journal on Control and Optimization, vol.43, issue.2, pp.477-501, 2004.
DOI : 10.1137/S0363012902419977

M. Benaïm, Dynamics of stochastic approximation algorithms, 1999.
DOI : 10.1007/978-1-4757-1947-5

J. Björnerstedt and J. W. Weibull, Nash equilibrium and evolution by imitation, The Rational Foundations of Economic Behavior, pp.155-181, 1996.

T. Börgers and R. Sarin, Learning Through Reinforcement and Replicator Dynamics, Journal of Economic Theory, vol.77, issue.1, pp.1-14, 1997.
DOI : 10.1006/jeth.1997.2319

V. S. Borkar, Stochastic approximation, Resonance, vol.8, issue.s.471012, 2008.
DOI : 10.1007/s12045-013-0136-x

M. Bravo, An adjusted payoff-based procedure for normal form games, 2011.

A. Cabrales, Stochastic Replicator Dynamics, International Economic Review, vol.41, issue.2, pp.451-481, 2000.
DOI : 10.1111/1468-2354.00071

R. Cominetti, E. Melo, and S. Sorin, A payoff-based learning procedure and its application to traffic games, Games and Economic Behavior, vol.70, issue.1, pp.71-83, 2010.
DOI : 10.1016/j.geb.2008.11.012

D. Fudenberg and C. Harris, Evolutionary dynamics with aggregate shocks, Journal of Economic Theory, vol.57, issue.2, pp.420-441, 1992.
DOI : 10.1016/0022-0531(92)90044-I

D. Fudenberg and D. K. Levine, The Theory of Learning in Games, Economic learning and social evolution, 1998.

S. Hart and A. Mas-Colell, A Simple Adaptive Procedure Leading to Correlated Equilibrium, Econometrica, vol.68, issue.5, pp.1127-1150, 2000.
DOI : 10.1111/1468-0262.00153

S. Hart and A. Mas-Colell, A reinforcement procedure leading to correlated equilibrium, Economic Essays, pp.181-200, 2001.

J. Hofbauer and W. H. Sandholm, On the Global Convergence of Stochastic Fictitious Play, Econometrica, vol.70, issue.6, pp.2265-2294, 2002.
DOI : 10.1111/1468-0262.00376

J. Hofbauer and K. Sigmund, Evolutionary Games and Population Dynamics, 1998.
DOI : 10.1017/CBO9781139173179

J. Hofbauer, S. Sorin, and Y. Viossat, Time Average Replicator and Best-Reply Dynamics, Mathematics of Operations Research, vol.34, issue.2, pp.263-269, 2009.
DOI : 10.1287/moor.1080.0359
URL : https://hal.archives-ouvertes.fr/hal-00360767

E. Hopkins, Two Competing Models of How People Learn in Games, Econometrica, vol.70, issue.6, pp.2141-2166, 2002.
DOI : 10.1111/1468-0262.00372

E. Hopkins and M. Posch, Attainability of boundary points under reinforcement learning, Games and Economic Behavior, vol.53, issue.1, pp.110-125, 2004.
DOI : 10.1016/j.geb.2004.08.002

D. Lamberton, G. Pagès, and P. Tarrès, When can the two-armed bandit algorithm be trusted? The Annals of Applied Probability, pp.1424-1454, 2004.

R. Laraki and P. Mertikopoulos, Higher order game dynamics, Journal of Economic Theory, vol.148, issue.6, pp.2666-2695, 2013.
DOI : 10.1016/j.jet.2013.08.002
URL : https://hal.archives-ouvertes.fr/hal-01382303

R. Laraki and P. Mertikopoulos, Inertial Game Dynamics and Applications to Constrained Optimization, SIAM Journal on Control and Optimization, vol.53, issue.5, 2013.
DOI : 10.1137/130920253
URL : https://hal.archives-ouvertes.fr/hal-01382295

J. M. Lee, Introduction to Smooth Manifolds. No. 218 in Graduate Texts in Mathematics, 2003.

D. S. Leslie, Reinforcement learning in games, 2004.

D. S. Leslie and E. J. Collins, Individual Q-Learning in Normal Form Games, SIAM Journal on Control and Optimization, vol.44, issue.2, pp.495-514, 2005.
DOI : 10.1137/S0363012903437976
URL : https://hal.archives-ouvertes.fr/hal-00959139

M. Marsili, D. Challet, and R. Zecchina, Exact solution of a modified El Farol's bar problem: Efficiency and the role of market impact, Physica A: Statistical Mechanics and its Applications, vol.280, issue.3-4, pp.522-553, 2000.
DOI : 10.1016/S0378-4371(99)00610-X

D. L. McFadden, The measurement of urban travel demand, Journal of Public Economics, vol.3, issue.4, pp.303-328, 1974.
DOI : 10.1016/0047-2727(74)90003-6

R. D. McKelvey and T. R. Palfrey, Quantal Response Equilibria for Normal Form Games, Games and Economic Behavior, vol.10, issue.1, pp.6-38, 1995.
DOI : 10.1006/game.1995.1023

P. Mertikopoulos, E. V. Belmega, and A. L. Moustakas, Matrix exponential learning: Distributed optimization in MIMO systems, 2012 IEEE International Symposium on Information Theory Proceedings, 2012.
DOI : 10.1109/ISIT.2012.6284117
URL : https://hal.archives-ouvertes.fr/hal-00741823

P. Mertikopoulos and A. L. Moustakas, The emergence of rational behavior in the presence of stochastic perturbations, The Annals of Applied Probability, vol.20, issue.4, pp.1359-1388, 2010.
DOI : 10.1214/09-AAP651
URL : https://hal.archives-ouvertes.fr/hal-01382306

D. Monderer and L. S. Shapley, Potential Games, Games and Economic Behavior, vol.14, issue.1, pp.124-143, 1996.
DOI : 10.1006/game.1996.0044

K. Ritzberger and J. W. Weibull, Evolutionary Selection in Normal-Form Games, Econometrica, vol.63, issue.6, pp.1371-1399, 1995.
DOI : 10.2307/2171774

R. T. Rockafellar, Convex Analysis, 1970.
DOI : 10.1515/9781400873173

A. Rustichini, Optimal Properties of Stimulus-Response Learning Models, Games and Economic Behavior, vol.29, issue.1-2, pp.230-244, 1999.
DOI : 10.1006/game.1999.0712

W. H. Sandholm, Population Games and Evolutionary Dynamics. Economic learning and social evolution, 2010.

P. S. Sastry, V. V. Phansalkar, and M. A. Thathachar, Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information, IEEE Transactions on Systems, Man, and Cybernetics, vol.24, issue.5, pp.769-777, 1994.
DOI : 10.1109/21.293490

T. Sharia, Truncated stochastic approximation with moving bounds: convergence. ArXiv e-prints, 2011.

S. Sorin, Exponential weight algorithm in continuous time, Mathematical Programming, pp.513-528, 2009.
DOI : 10.1007/s10107-007-0111-y

P. D. Taylor and L. B. Jonker, Evolutionary stable strategies and game dynamics, Mathematical Biosciences, vol.40, issue.1-2, pp.145-156, 1978.
DOI : 10.1016/0025-5564(78)90077-9

K. Tuyls, P. J. Hoen, and B. Vanschoenwinkel, An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games, Autonomous Agents and Multi-Agent Systems, vol.8, issue.6, pp.115-153, 2006.
DOI : 10.1007/s10458-005-3783-9

E. van Damme, Stability and perfection of Nash equilibria, 1987.

J. W. Weibull, Evolutionary Game Theory, 1995.

H. P. Young, Learning by trial and error, Games and Economic Behavior, vol.65, issue.2, pp.626-643, 2009.
DOI : 10.1016/j.geb.2008.02.011

Univ. Versailles, PRiSM, Versailles, France
E-mail address: pierre.coucheney@uvsq.fr
URL: http://www.prism.uvsq.fr/users/pico/index.html

Inria, Univ. Grenoble Alpes, LIG, F-38000 Grenoble, France
E-mail address: bruno.gaujal@inria.fr
URL: http://mescal.imag.fr/membres/bruno.gaujal

CNRS (French National Center for Scientific Research)