Reinforcement Learning Approaches in Dynamic Environments

Miyoung Han 1, 2
2 VALDA - Value from Data
DI-ENS - Département d'informatique de l'École normale supérieure, Inria de Paris
Abstract : Reinforcement learning is learning from interaction with an environment to achieve a goal. It is an efficient framework for solving sequential decision-making problems, using Markov decision processes (MDPs) as a general problem formulation. In this thesis, we apply reinforcement learning to sequential decision-making problems in dynamic environments. We first present an algorithm based on Q-learning, with a customized exploration and exploitation strategy, to solve a real taxi routing problem. Our algorithm progressively learns optimal actions for routing an autonomous taxi to passenger pick-up points. Then, we address the factored MDP problem in a non-deterministic setting. We propose an algorithm that learns transition functions using the Dynamic Bayesian Network formalism. We demonstrate that factorization methods make it possible to learn correct models efficiently; with the learned models, the agent can accrue higher cumulative rewards. We then extend our work to very large domains. In the focused crawling problem, we propose a new scoring mechanism that takes into account the long-term effects of selecting a link, and we present new feature representations of states for Web pages and of actions for next-link selection. This approach improves the efficiency of focused crawling. In the influence maximization (IM) problem, we extend the classical IM problem to settings with incomplete knowledge of the graph structure and with topic-based user interests. Our algorithm finds the most influential seeds to maximize topic-based influence by learning action values for each probed node.
Cited literature : 110 references

https://hal.inria.fr/tel-01891805
Contributor : Pierre Senellart
Submitted on : Wednesday, October 10, 2018 - 7:47:35 AM
Last modification on : Thursday, October 17, 2019 - 12:36:54 PM
Long-term archiving on : Friday, January 11, 2019 - 12:33:14 PM

File

Thesis.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01891805, version 1

Citation

Miyoung Han. Reinforcement Learning Approaches in Dynamic Environments. Databases [cs.DB]. Télécom ParisTech, 2018. English. ⟨tel-01891805⟩
