F. Alvarez, J. Bolte, and O. Brahic, Hessian Riemannian gradient flows in convex programming, SIAM Journal on Control and Optimization, vol.43, issue.2, pp.477-501, 2004.
URL : https://hal.archives-ouvertes.fr/hal-01928141

M. Arjovsky, S. Chintala, and L. Bottou, Wasserstein generative adversarial networks, Proceedings of the 34th International Conference on Machine Learning, pp.214-223, 2017.

S. Arora, E. Hazan, and S. Kale, The multiplicative weights update method: A meta-algorithm and applications, Theory of Computing, vol.8, pp.121-164, 2012.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, Gambling in a rigged casino: The adversarial multi-armed bandit problem, Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995.

J. P. Bailey and G. Piliouras, Multiplicative weights update in zero-sum games, Proceedings of the 2018 ACM Conference on Economics and Computation, pp.321-338, 2018.

J. P. Bailey and G. Piliouras, Multi-agent learning in network zero-sum games is a Hamiltonian system, Int. Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2019.

D. Balduzzi, S. Racanière, J. Martens, J. N. Foerster, K. Tuyls et al., The mechanics of n-player differentiable games, ICML, 2018.

H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2017.

S. Bhojanapalli, B. Neyshabur, and N. Srebro, Global optimality of local search for low rank matrix recovery, Advances in Neural Information Processing Systems, pp.3873-3881, 2016.

L. Bottou, Online learning and stochastic approximations, in Online Learning in Neural Networks, vol.17, p.142, 1998.

M. Bravo, D. S. Leslie, and P. Mertikopoulos, Bandit learning in concave N-person games, NIPS '18: Proceedings of the 32nd International Conference on Neural Information Processing Systems, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01891523

L. M. Bregman, The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Computational Mathematics and Mathematical Physics, vol.7, issue.3, pp.200-217, 1967.

S. Bubeck, Convex optimization: Algorithms and complexity, Foundations and Trends in Machine Learning, vol.8, pp.231-358, 2015.

G. Chen and M. Teboulle, Convergence analysis of a proximal-like minimization algorithm using Bregman functions, SIAM Journal on Optimization, vol.3, issue.3, pp.538-543, 1993.

J. Cohen, A. Héliou, and P. Mertikopoulos, Learning with bandit feedback in potential games, NIPS '17: Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01643352

C. Daskalakis, A. Ilyas, V. Syrgkanis, and H. Zeng, Training GANs with optimism, ICLR '18: Proceedings of the 2018 International Conference on Learning Representations, 2018.

F. Facchinei and J. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems. Springer Series in Operations Research, 2003.

Y. Freund and R. E. Schapire, Adaptive game playing using multiplicative weights, Games and Economic Behavior, vol.29, pp.79-103, 1999.

D. Fudenberg and D. K. Levine, The Theory of Learning in Games, vol.2 of Economic Learning and Social Evolution, 1998.

R. Ge, F. Huang, C. Jin, and Y. Yuan, Escaping from saddle points: Online stochastic gradient for tensor decomposition, Conference on Learning Theory, pp.797-842, 2015.

R. Ge, C. Jin, and Y. Zheng, No spurious local minima in nonconvex low rank problems: A unified geometric analysis, 2017.

G. Gidel, H. Berard, P. Vincent, and S. Lacoste-Julien, A variational inequality perspective on generative adversarial networks, 2018.

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley et al., Generative adversarial nets, Advances in Neural Information Processing Systems, pp.2672-2680, 2014.

I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, Improved training of Wasserstein GANs, Advances in Neural Information Processing Systems, vol.30, pp.5769-5779, 2017.

P. Hall and C. C. Heyde, Martingale Limit Theory and Its Application. Probability and Mathematical Statistics, 1980.

A. Juditsky, A. Nemirovski, and C. Tauvel, Solving variational inequalities with stochastic mirror-prox algorithm, Stochastic Systems, vol.1, issue.1, pp.17-58, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00318043

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2014.

G. M. Korpelevich, The extragradient method for finding saddle points and other problems, Èkonom. i Mat. Metody, vol.12, pp.747-756, 1976.

J. D. Lee, M. Simchowitz, M. I. Jordan, and B. Recht, Gradient descent only converges to minimizers, Conference on Learning Theory, pp.1246-1257, 2016.

J. D. Lee, I. Panageas, G. Piliouras, M. Simchowitz, M. I. Jordan, and B. Recht, First-order methods almost always avoid strict saddle points, Mathematical Programming, 2019.

P. Mertikopoulos and M. Staudigl, On the convergence of gradient-like flows with noisy gradient input, SIAM Journal on Optimization, vol.28, issue.1, pp.163-197, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01404586

P. Mertikopoulos, C. H. Papadimitriou, and G. Piliouras, Cycles in adversarial regularized learning, SODA '18: Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01643338

Figure: Vanilla versus optimistic Adam training in the CelebA and CIFAR-10 datasets (left and right respectively).

Figure: GAN training with and without an extra-gradient step in the CelebA and CIFAR-10 datasets.
Table 2: Image experiment settings ('WGAN-GP'; Optimizer = 'extra-Adam' or ...).