inria-00173202, version 1
Non linear programming for stochastic dynamic programming
Olivier Teytaud 1, Sylvain Gelly 1
Icinco 2007 (2007)
Abstract: Many stochastic dynamic programming tasks in continuous action spaces are tackled through discretization. Here we avoid discretization; approximate dynamic programming (ADP) then involves (i) many learning tasks for Bellman-function regression, performed here by Support Vector Machines, and (ii) many non-linear optimization tasks for action selection, for which we compare many algorithms. We include discretizations of the domain among the non-linear programming tools in our experiments, so that we compare optimization approaches and discretization methods at the same time. We conclude that robustness is strongly required of the non-linear optimizers used in ADP. Experimental results show that (i) discretization is sometimes inefficient, but some specific discretization is very efficient for "bang-bang" problems; (ii) simple evolutionary tools outperform quasi-random methods in a stable manner; (iii) gradient-based techniques are much less stable; (iv) for most high-dimensional "less unsmooth" problems, Covariance-Matrix-Adaptation ranks first.
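The action-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the dynamics, the reward, the stand-in value function `toy_value` (used here in place of an SVM regressor), and all function names are hypothetical. It only shows how a grid discretization and a simple evolutionary optimizer can both be treated as interchangeable tools for maximizing a one-step lookahead over a continuous action.

```python
import random

def q_value(state, action, value_fn):
    """One-step lookahead: immediate reward plus value of the next state."""
    next_state = state + action        # toy deterministic dynamics (assumption)
    reward = -0.1 * action ** 2        # toy action cost (assumption)
    return reward + value_fn(next_state)

def select_by_grid(state, value_fn, n_points=11, low=-1.0, high=1.0):
    """Discretization used as an optimizer: evaluate a regular grid of actions."""
    grid = [low + i * (high - low) / (n_points - 1) for i in range(n_points)]
    return max(grid, key=lambda a: q_value(state, a, value_fn))

def select_by_es(state, value_fn, budget=200, low=-1.0, high=1.0, seed=0):
    """(1+1) evolution strategy with a simple success-based step-size rule."""
    rng = random.Random(seed)
    best = rng.uniform(low, high)
    best_q = q_value(state, best, value_fn)
    sigma = 0.5
    for _ in range(budget):
        # Mutate the current best action and clip it to the action bounds.
        cand = min(high, max(low, best + rng.gauss(0.0, sigma)))
        cand_q = q_value(state, cand, value_fn)
        if cand_q >= best_q:
            best, best_q = cand, cand_q
            sigma *= 1.5               # success: widen the search
        else:
            sigma *= 0.9               # failure: narrow the search
    return best

def toy_value(x):
    """Stand-in for a learned Bellman-function approximation (assumption)."""
    return -(x - 0.3) ** 2

a_grid = select_by_grid(0.0, toy_value)
a_es = select_by_es(0.0, toy_value)
```

In this toy problem the grid can only return one of its 11 points, while the evolutionary search can move between them. The grid does contain both action bounds, which is the kind of property that matters when the best action lies at an extreme, as in "bang-bang" problems.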
- 1: TAO (INRIA Futurs)
- INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud
- Domain : Mathematics/Optimization and Control
- Keywords : Control – Dynamic programming – Non Linear Programming
- http://hal.inria.fr/inria-00173202
- oai:hal.inria.fr:inria-00173202
- From: Olivier Teytaud
- Submitted on: Wednesday, 19 September 2007 14:15:45
- Updated on: Wednesday, 19 September 2007 14:19:52





