Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming

Eugene A. Feinberg 1 Jefferson Huang 1 Bruno Scherrer 2
2 MAIA - Autonomous intelligent machine
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract : This note shows that the number of arithmetic operations required by any member of a broad class of optimistic policy iteration algorithms to solve a deterministic discounted dynamic programming problem with three states and four actions may grow arbitrarily. Therefore any such algorithm is not strongly polynomial. In particular, the modified policy iteration and $\lambda$-policy iteration algorithms are not strongly polynomial.
Document type :
Journal articles
Complete list of metadatas

Cited literature [16 references]  Display  Hide  Download

https://hal.inria.fr/hal-01091370
Contributor : Bruno Scherrer <>
Submitted on : Friday, December 5, 2014 - 11:48:44 AM
Last modification on : Tuesday, August 13, 2019 - 11:22:21 AM
Long-term archiving on : Monday, March 9, 2015 - 6:02:58 AM

Files

Feinberg_Huang_Scherrer.pdf
Files produced by the author(s)

Identifiers

Collections

Citation

Eugene A. Feinberg, Jefferson Huang, Bruno Scherrer. Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming. Operations Research Letters, Elsevier, 2014, 42, pp.429 - 431. ⟨10.1016/j.orl.2014.07.006⟩. ⟨hal-01091370⟩

Share

Metrics

Record views

843

Files downloads

425