Conference papers

Monte Carlo Information-Oriented Planning

Vincent Thomas 1, Gérémy Hutin 2, Olivier Buffet 1
1 LARSEN - Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract: In this article, we discuss how to solve information-gathering problems expressed as ρ-POMDPs, an extension of Partially Observable Markov Decision Processes (POMDPs) whose reward ρ depends on the belief state. Point-based approaches used for solving POMDPs have been extended to solving ρ-POMDPs as belief MDPs when the reward ρ is convex over the belief space B or when it is Lipschitz-continuous. In the present paper, we build on the POMCP algorithm to propose a Monte Carlo Tree Search for ρ-POMDPs, aiming for an efficient on-line planner that can be used for any ρ function. Because the rewards depend on the belief, adaptations are required to (i) propagate more than one state at a time and (ii) prevent biases in value estimates. An asymptotic convergence proof to ε-optimal values is given when ρ is continuous. Experiments are conducted to analyze the algorithms at hand and show that they outperform myopic approaches.
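To make point (i) concrete, here is a minimal sketch of why a belief-dependent reward calls for propagating a whole bag of states rather than a single sampled state. It is an illustration under stated assumptions, not the algorithm from the paper: negative entropy is only one possible choice of ρ, and the names `neg_entropy`, `propagate`, and `noisy_look`, as well as the toy sensor model, are hypothetical.

```python
import math
import random
from collections import Counter

def neg_entropy(particles):
    """Example rho(b): negative Shannon entropy of the empirical belief."""
    n = len(particles)
    return sum((c / n) * math.log(c / n) for c in Counter(particles).values())

def propagate(particles, action, step, rng):
    """Push the whole bag through the generative model, grouped by observation."""
    bags = {}
    for state in particles:
        next_state, obs = step(state, action, rng)
        bags.setdefault(obs, []).append(next_state)
    return bags

def noisy_look(state, action, rng):
    """Hypothetical toy model: static hidden bit, sensor correct 80% of the time."""
    obs = state if rng.random() < 0.8 else 1 - state
    return state, obs

rng = random.Random(0)
prior = [0] * 50 + [1] * 50  # uniform belief over the hidden bit
for obs, bag in sorted(propagate(prior, "look", noisy_look, rng).items()):
    # Each observation branch gets its own particle bag, hence its own rho(b).
    print(obs, len(bag), round(neg_entropy(bag), 3))
```

A single propagated state would give a degenerate belief in every branch, so ρ could not be estimated from it; grouping many particles by observation yields one empirical belief per branch on which ρ can be evaluated.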

Cited literature [26 references]
Contributor: Vincent Thomas
Submitted on: Friday, September 18, 2020 - 3:09:49 PM
Last modification on: Wednesday, October 27, 2021 - 8:00:44 AM
Long-term archiving on: Friday, December 4, 2020 - 10:25:16 PM


Files produced by the author(s)


  • HAL Id: hal-02943028, version 1


Vincent Thomas, Gérémy Hutin, Olivier Buffet. Monte Carlo Information-Oriented Planning. 24th ECAI 2020 - European Conference on Artificial Intelligence, Aug 2020, Santiago de Compostela, Spain. ⟨hal-02943028⟩


