Cooperative Markov decision processes: time consistency, greedy players satisfaction, and cooperation maintenance

Abstract: We deal with multi-agent Markov decision processes (MDPs) in which cooperation among players is allowed. We find a cooperative payoff distribution procedure (MDP-CPDP) that distributes, over the course of the game, the payoff that players would earn in the long-run game. We show under which conditions such an MDP-CPDP fulfills a time consistency property, satisfies greedy players, and strengthens the cohesiveness of the coalition throughout the game. Finally, we refine the concept of the Core for cooperative MDPs.
Document type:
Journal article
International Journal of Game Theory, Springer Verlag, 2013, 42 (1), pp.239-262. 〈http://link.springer.com/article/10.1007/s00182-012-0343-9〉. 〈10.1007/s00182-012-0343-9〉

https://hal.inria.fr/hal-00926471
Contributor: Konstantin Avrachenkov
Submitted on: Thursday, January 9, 2014 - 16:01:32
Last modified on: Saturday, January 27, 2018 - 01:31:39

Citation

Konstantin Avrachenkov, Laura Cottatellucci, Lorenzo Maggi. Cooperative Markov decision processes: time consistency, greedy players satisfaction, and cooperation maintenance. International Journal of Game Theory, Springer Verlag, 2013, 42 (1), pp.239-262. 〈http://link.springer.com/article/10.1007/s00182-012-0343-9〉. 〈10.1007/s00182-012-0343-9〉. 〈hal-00926471〉
