P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2-3, pp.235-256, 2002.

E. Baker, Technical Note: An Exact Algorithm for the Time-Constrained Traveling Salesman Problem, Operations Research, vol.31, issue.5, pp.938-945, 1983.
DOI : 10.1287/opre.31.5.938

T. Cazenave, Monte Carlo Beam Search, IEEE Transactions on Computational Intelligence and AI in Games, vol.4, issue.1, pp.68-72, 2012.
DOI : 10.1109/TCIAIG.2011.2180723

URL : https://hal.archives-ouvertes.fr/hal-01498618

T. Cazenave, Nested Monte-Carlo search, IJCAI, pp.456-461, 2009.
URL : https://hal.archives-ouvertes.fr/hal-01436286

T. Cazenave and F. Teytaud, Application of the Nested Rollout Policy Adaptation Algorithm to the Traveling Salesman Problem with Time Windows, LION, pp.42-54, 2011.
DOI : 10.1007/978-3-642-34413-8_4

URL : https://hal.archives-ouvertes.fr/hal-01406457

N. Christofides, A. Mingozzi, and P. Toth, State-space relaxation procedures for the computation of bounds to routing problems, Networks, vol.11, issue.2, pp.145-164, 1981.
DOI : 10.1002/net.3230110207

Y. Dumas, J. Desrosiers, E. Gelinas, and M. Solomon, An Optimal Algorithm for the Traveling Salesman Problem with Time Windows, Operations Research, vol.43, issue.2, pp.367-371, 1995.
DOI : 10.1287/opre.43.2.367

S. Edelkamp and M. Gath, Optimal decision making in agent-based autonomous groupage traffic, ICAART, 2013.

H. Finnsson and Y. Björnsson, Simulation-based approach to general game playing, AAAI, pp.1134-1139, 2008.

F. Focacci, A. Lodi, and M. Milano, A Hybrid Exact Algorithm for the TSPTW, INFORMS Journal on Computing, vol.14, issue.4, pp.403-417, 2002.
DOI : 10.1287/ijoc.14.4.403.2827

S. Gelly and Y. Wang, Exploration exploitation in Go: UCT for Monte-Carlo Go, NIPS-Workshop on On-line Trading of Exploration and Exploitation, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00115330

M. Gendreau, A. Hertz, G. Laporte, and M. Stan, A Generalized Insertion Heuristic for the Traveling Salesman Problem with Time Windows, Operations Research, vol.46, issue.3, pp.330-335, 1998.
DOI : 10.1287/opre.46.3.330

R. Jonker and A. Volgenant, Improving the Hungarian assignment algorithm, Operations Research Letters, vol.5, issue.4, pp.171-175, 1986.
DOI : 10.1016/0167-6377(86)90073-8

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, ECML, pp.282-293, 2006.
DOI : 10.1007/11871842_29

M. López-Ibáñez and C. Blum, Beam-ACO for the travelling salesman problem with time windows, Computers & Operations Research, vol.37, issue.9, pp.1570-1583, 2010.
DOI : 10.1016/j.cor.2009.11.015

N. C. Love, T. L. Hinrichs, and M. R. Genesereth, General game playing: Game description language specification, 2006.

S. Parragh, K. Doerner, and R. Hartl, A survey on pickup and delivery problems, Journal für Betriebswirtschaft, vol.58, issue.2, pp.81-117, 2008.
DOI : 10.1007/s11301-008-0036-4

G. Pesant, M. Gendreau, J. Potvin, and J. Rousseau, An Exact Constraint Logic Programming Algorithm for the Traveling Salesman Problem with Time Windows, Transportation Science, vol.32, issue.1, pp.12-29, 1998.
DOI : 10.1287/trsc.32.1.12

J. Potvin and S. Bengio, The Vehicle Routing Problem with Time Windows Part II: Genetic Search, INFORMS Journal on Computing, vol.8, issue.2, p.165, 1996.
DOI : 10.1287/ijoc.8.2.165

A. Rimmel, F. Teytaud, and T. Cazenave, Optimization of the Nested Monte-Carlo Algorithm on the Traveling Salesman Problem with Time Windows, Applications of Evolutionary Computation, pp.501-510, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00563668

A. Rimmel, F. Teytaud, and O. Teytaud, Biasing Monte-Carlo Simulations through RAVE Values, Computers and Games, pp.59-68, 2011.
DOI : 10.1007/978-3-642-17928-0_6

URL : https://hal.archives-ouvertes.fr/inria-00485555

C. D. Rosin, Nested rollout policy adaptation for Monte-Carlo tree search, IJCAI, pp.649-654, 2011.

M. Solomon, Algorithms for the Vehicle Routing and Scheduling Problems with Time Window Constraints, Operations Research, vol.35, issue.2, pp.254-265, 1987.
DOI : 10.1287/opre.35.2.254