D. Aberdeen and O. Buffet, Concurrent probabilistic temporal planning with policy-gradients, ICAPS, pp.10-17, 2007.

H. Blockeel and L. De Raedt, Top-down induction of first-order logical decision trees, Artificial Intelligence, vol.101, issue.1-2, pp.285-297, 1998.
DOI : 10.1016/S0004-3702(98)00034-4

C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. Cowling et al., A survey of Monte Carlo tree search methods, IEEE Transactions on Computational Intelligence and AI in Games, vol.4, issue.1, pp.1-43, 2012.

S. Džeroski, L. De Raedt, and K. Driessens, Relational reinforcement learning, Machine Learning, vol.43, issue.1-2, pp.7-52, 2001.
DOI : 10.1007/BFb0027307

J. H. Friedman, Greedy function approximation: A gradient boosting machine, The Annals of Statistics, vol.29, issue.5, pp.1189-1232, 2001.
DOI : 10.1214/aos/1013203451

T. Keller and M. Helmert, Trial-based heuristic tree search for finite horizon MDPs, ICAPS, 2013.

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, Machine Learning: ECML 2006, pp.282-293, 2006.
DOI : 10.1007/11871842_29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296

T. Lang, M. Toussaint, and K. Kersting, Exploration in relational domains for model-based reinforcement learning, The Journal of Machine Learning Research, vol.13, issue.1, pp.3725-3768, 2012.

Mausam and D. S. Weld, Planning with durative actions in stochastic domains, J. Artif. Intell. Res. (JAIR), vol.31, pp.33-82, 2008.

T. Munzer, B. Piot, M. Geist, O. Pietquin, and M. Lopes, Inverse reinforcement learning in relational domains, International Joint Conferences on Artificial Intelligence, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01154650

S. Natarajan, T. Khot, K. Kersting, B. Gutmann, and J. Shavlik, Gradient-based boosting for statistical relational learning: The relational dependency network case, Machine Learning, pp.25-56, 2012.
DOI : 10.1007/s10994-011-5244-9

K. Rohanimanesh and S. Mahadevan, Learning to take concurrent actions, Advances in neural information processing systems, pp.1619-1626, 2002.

K. Rohanimanesh and S. Mahadevan, Coarticulation: An approach for generating concurrent plans in Markov decision processes, Proceedings of the 22nd international conference on Machine learning, ICML '05, pp.720-727, 2005.
DOI : 10.1145/1102351.1102442

D. E. Smith and D. S. Weld, Temporal planning with mutual exclusion reasoning, IJCAI, pp.326-337, 1999.

H. A. Younes and R. G. Simmons, Policy generation for continuous-time stochastic domains with concurrency, ICAPS, p.325, 2004.

L. S. Zettlemoyer, H. Pasula, and L. P. Kaelbling, Learning planning rules in noisy stochastic worlds, AAAI, pp.911-918, 2005.