Non linear programming for stochastic dynamic programming

Olivier Teytaud 1, Sylvain Gelly 1
1 TANC - Algorithmic number theory for cryptology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France, X - École polytechnique, CNRS - Centre National de la Recherche Scientifique : UMR7161
Abstract: Many stochastic dynamic programming tasks in continuous action spaces are tackled through discretization. Here we avoid discretization; approximate dynamic programming (ADP) then involves (i) many learning tasks, performed here by Support Vector Machines, for regression of the Bellman function, and (ii) many non-linear optimization tasks for action selection, for which we compare many algorithms. We include discretizations of the domain as particular non-linear programming tools in our experiments, so that we also compare optimization approaches with discretization methods. We conclude that robustness is strongly required of the non-linear optimizations in ADP. Experimental results show that (i) discretization is sometimes inefficient, although some specific discretization is very efficient for "bang-bang" problems; (ii) simple evolutionary tools outperform quasi-random methods in a stable manner; (iii) gradient-based techniques are much less stable; and (iv) for most high-dimensional "less unsmooth" problems, Covariance Matrix Adaptation ranks first.
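The action-selection step mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar toy transition and cost, the quadratic `value_fn` standing in for a learned Bellman-function regressor (the paper uses Support Vector Machines), and the two optimizers, plain random search and a basic (1+1) evolution strategy, which are much simpler than the algorithms compared in the paper.

```python
import random


def q_value(state, action, value_fn, noise_samples):
    """Average of immediate cost plus approximate future cost over noise samples."""
    total = 0.0
    for w in noise_samples:
        next_state = state + action + w   # toy transition (assumption)
        cost = 0.1 * action ** 2          # toy instantaneous cost (assumption)
        total += cost + value_fn(next_state)
    return total / len(noise_samples)


def random_search(objective, low, high, budget, rng):
    """Keep the best of `budget` actions drawn uniformly from [low, high]."""
    best_a, best_q = None, float("inf")
    for _ in range(budget):
        a = rng.uniform(low, high)
        q = objective(a)
        if q < best_q:
            best_a, best_q = a, q
    return best_a, best_q


def one_plus_one_es(objective, low, high, budget, rng):
    """Basic (1+1) evolution strategy with a one-fifth-style step-size rule."""
    a = 0.5 * (low + high)
    q = objective(a)
    sigma = 0.25 * (high - low)
    for _ in range(budget - 1):
        # Gaussian mutation, clamped to the admissible action interval.
        cand = min(high, max(low, a + sigma * rng.gauss(0.0, 1.0)))
        cand_q = objective(cand)
        if cand_q <= q:
            a, q = cand, cand_q
            sigma *= 1.5              # success: enlarge the step
        else:
            sigma *= 1.5 ** -0.25     # failure: shrink the step
    return a, q


if __name__ == "__main__":
    value_fn = lambda s: s * s        # stand-in for a learned regressor
    noise = [-0.1, 0.0, 0.1]          # fixed noise samples
    objective = lambda a: q_value(0.5, a, value_fn, noise)
    print(random_search(objective, -1.0, 1.0, 200, random.Random(0)))
    print(one_plus_one_es(objective, -1.0, 1.0, 200, random.Random(0)))
```

In this toy setting the objective is a smooth quadratic in the action, so both optimizers should land near the same minimizer. The harder cases the abstract refers to ("bang-bang" or unsmooth problems) are not represented here.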
Document type:
Conference paper
Icinco 2007, 2007, Angers, France. 2007
Cited literature [38 references]

https://hal.inria.fr/inria-00173202
Contributor: Olivier Teytaud
Submitted on: Wednesday, September 19, 2007 - 14:15:45
Last modified on: Thursday, May 10, 2018 - 02:06:30
Document(s) archived on: Friday, April 9, 2010 - 02:28:16

File

sefordp.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00173202, version 1

Citation

Olivier Teytaud, Sylvain Gelly. Non linear programming for stochastic dynamic programming. Icinco 2007, 2007, Angers, France. 2007. 〈inria-00173202〉

Metrics

Record views: 255

File downloads: 599