Apprenticeship learning via inverse reinforcement learning, Twenty-first international conference on Machine learning , ICML '04, 2004. ,
DOI : 10.1145/1015330.1015430
Preference-Based Policy Learning, pp.12-27 ,
DOI : 10.1007/978-3-642-23780-5_11
URL : https://hal.archives-ouvertes.fr/inria-00625001
Multiple instance ranking, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.48-55, 2008. ,
DOI : 10.1145/1390156.1390163
Active preference learning with discrete choice data, Advances in Neural Information Processing Systems 20, pp.409-416, 2008. ,
On Learning, Representing, and Generalizing a Task in a Humanoid Robot, IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), vol.37, issue.2, pp.286-298, 2007. ,
DOI : 10.1109/TSMCB.2006.886952
Preference-Based Policy Iteration: Leveraging Preference Learning for Reinforcement Learning, pp.312-327 ,
DOI : 10.1007/978-3-642-23780-5_30
Support-vector networks, Machine Learning, vol.1, issue.3, pp.273-297, 1995. ,
DOI : 10.1007/BF00994018
Coarse sample complexity bounds for active learning, Advances in Neural Information Processing Systems, 2005. ,
Pattern Classification and scene analysis, 1973. ,
Feature Selection for Reinforcement Learning: Evaluating Implicit State-Reward Dependency via Conditional Mutual Information, Proc. ECML/PKDD, pp.474-489, 2010. ,
DOI : 10.1007/978-3-642-15880-3_36
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.386.2141
Completely Derandomized Self-Adaptation in Evolution Strategies, Evolutionary Computation, vol.9, issue.2, pp.159-195, 2001. ,
DOI : 10.1016/0004-3702(95)00124-7
Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, p.51, 2009. ,
DOI : 10.1145/1553374.1553426
Bayes point machines, Journal of Machine Learning Research, vol.1, pp.245-279, 2001. ,
A support vector method for multivariate performance measures, Proceedings of the 22nd international conference on Machine learning , ICML '05, pp.377-384, 2005. ,
DOI : 10.1145/1102351.1102399
Training linear SVMs in linear time, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining , KDD '06, pp.217-226, 2006. ,
DOI : 10.1145/1150402.1150429
Efficient global optimization of expensive black-box functions, Journal of Global Optimization, vol.13, issue.4, pp.455-492, 1998. ,
DOI : 10.1023/A:1008306431147
Hierarchical apprenticeship learning with application to quadruped locomotion, 2007. ,
Constructing skill trees for reinforcement learning agents from demonstration trajectories, Advances in Neural Information Processing Systems 23, pp.1162-1170, 2010. ,
Least-squares policy iteration, Journal of Machine Learning Research (JMLR), vol.4, pp.1107-1149, 2003. ,
Locomotion control of quadruped robots based on cpginspired workspace trajectory generation, Proc. ICRA, pp.1250-1255, 2011. ,
Algorithms for inverse reinforcement learning, Proc. of the Seventeenth International Conference on Machine Learning (ICML-00, pp.663-670, 2000. ,
A sensorimotor account of vision and visual consciousness, Behavioral and Brain Sciences, vol.24, issue.05, p.939973, 2001. ,
DOI : 10.1017/S0140525X01000115
Reinforcement learning of motor skills with policy gradients, Neural Networks, vol.21, issue.4, pp.682-697, 2008. ,
DOI : 10.1016/j.neunet.2008.02.003
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, vol.4, issue.1, 2010. ,
DOI : 10.2200/S00268ED1V01Y201005AIM009
Large margin methods for structured and interdependent output variables, Journal of Machine Learning Research, vol.6, pp.1453-1484, 2005. ,
Monte Carlo Methods for Preference Learning, Proc. Learning and Intelligent OptimizatioN, LION 6, 2012. ,
DOI : 10.1007/978-3-642-34413-8_52
Optimal Bayesian recommendation sets and myopically optimal choice query sets, pp.2352-2360, 2010. ,
Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning, Autonomous Agents and Multi-Agent Systems, vol.87, issue.9, pp.1-27, 2010. ,
DOI : 10.1007/s10458-009-9100-2
Reinforcement learning design for cancer clinical trials, Statistics in Medicine, vol.22, issue.1, 2009. ,
DOI : 10.1002/sim.3720