An Approximate Dynamic Programming Approach to Repeated Games with Vector Losses - Archive ouverte HAL
Journal Articles Operations Research Year : 2022

An Approximate Dynamic Programming Approach to Repeated Games with Vector Losses

Vijay Kamble (1), Patrick Loiseau (2), Jean Walrand (3)

Abstract

We describe an approximate dynamic programming (ADP) approach to compute approximations of the optimal strategies and of the minimal losses that can be guaranteed in discounted repeated games with vector-valued losses. Among other applications, such vector-valued games prominently arise in the analysis of worst-case regret in repeated decision making in unknown environments, also known as the adversarial online learning framework. At the core of our approach is a characterization of the lower Pareto frontier of the set of expected losses that a player can guarantee in these games as the unique fixed point of a set-valued dynamic programming operator. When applied to the problem of worst-case regret minimization with discounted losses, our approach yields algorithms that achieve markedly improved performance bounds compared with off-the-shelf online learning algorithms like Hedge. These results thus suggest the significant potential of ADP-based approaches in adversarial online learning.
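The abstract compares the proposed ADP-based strategies against off-the-shelf online learning algorithms such as Hedge. As background, a minimal sketch of the standard Hedge (exponential weights) algorithm for the experts setting is shown below; the function name, the learning-rate parameter `eta`, and the list-of-lists loss representation are illustrative choices, not taken from the paper.

```python
import math

def hedge(loss_matrix, eta=0.5):
    """Run Hedge (exponential weights) over a sequence of loss vectors,
    one vector per round, each entry giving the loss of one expert.
    Returns the per-round distributions over experts and the total
    expected loss incurred by the algorithm."""
    n = len(loss_matrix[0])
    weights = [1.0] * n  # uniform initial weights over the n experts
    distributions, total_loss = [], 0.0
    for losses in loss_matrix:
        z = sum(weights)
        probs = [w / z for w in weights]  # normalize to a distribution
        distributions.append(probs)
        total_loss += sum(p * l for p, l in zip(probs, losses))
        # Exponentially downweight each expert in proportion to its loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return distributions, total_loss

# Example: expert 0 always incurs loss 1, expert 1 always incurs loss 0;
# the distribution shifts toward expert 1 over the rounds.
dists, total = hedge([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]], eta=0.5)
```

Hedge guarantees sublinear worst-case regret against the best fixed expert; the paper's contribution is that, for the discounted-loss setting, ADP applied to the set-valued dynamic programming characterization yields strategies with markedly better guarantees than this generic baseline.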
Main file: Kamble-etal_ApproxDP-repeated-games-vector-losses_OR22-toappear.pdf (3.05 MB)
Appendix: Kamble-etal_ApproxDP-repeated-games-vector-losses_OR22-toappear-appendix.pdf (2.89 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03951270 , version 1 (23-01-2023)

Identifiers

Cite

Vijay Kamble, Patrick Loiseau, Jean Walrand. An Approximate Dynamic Programming Approach to Repeated Games with Vector Losses. Operations Research, 2022, ⟨10.1287/opre.2022.2334⟩. ⟨hal-03951270⟩