Ledeepchef: Deep reinforcement learning agent for families of text-based games, 2019. ,
Safe reinforcement learning via shielding, Proc. of AAAI, 2018. ,
Learning action representations for reinforcement learning, Proc. of ICML, 2019. ,
Learning action-transferable policy with action embedding, 2019. ,
BabyAI: First steps towards grounded language learning with a human in the loop, Proc. of ICLR, 2019. ,
Textworld: A learning environment for text-based games, 2018. ,
Fast reinforcement learning with large action sets using error-correcting output codes for mdp factorization, Proc. of ECML and PKDD, pp.180-194, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00747729
, Deep reinforcement learning in large discrete action spaces, 2015.
Multiagent reinforcement learning for integrated network of adaptive traffic signal controllers (marlin-atsc): methodology and large-scale application on downtown toronto, Proc. of TITS, 2013. ,
Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems, Journal of machine learning research, vol.7, pp.1079-1105, 2006. ,
Deep reinforcement learning with double q-learning, Proc. of AAAI, 2016. ,
Deep reinforcement learning with a natural language action space, Proc. of ACL, pp.1621-1630, 2016. ,
Deep reinforcement learning with a combinatorial action space for predicting popular reddit threads, Proc. of EMNLP, pp.1838-1848, 2016. ,
Deep q-learning from demonstrations, Proc. of AAAI, 2018. ,
Long short-term memory, Neural computation, vol.9, issue.8, pp.1735-1780, 1997. ,
Asymptotically efficient adaptive allocation rules, Advances in applied mathematics, vol.6, issue.1, pp.4-22, 1985. ,
, , 2018.
Data center cooling using model-predictive control, Proc. of NeurIPS, 2018. ,
Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 1995. ,
Resource management with deep reinforcement learning, Proc. of ACM Workshop on Hot Topics in Networks, 2016. ,
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, p.529, 2015. ,
Policy invariance under reward transformations: Theory and application to reward shaping, Proc. of ICML, 1999. ,
Safely interruptible agents, Proc. of UAI, 2016. ,
Boosted Bellman residual minimization handling expert demonstrations, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, issue.2, pp.549-564, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01060953
Observe and look further: Achieving consistent performance on atari, 2018. ,
Markov Decision Processes.: Discrete Stochastic Dynamic Programming, 2014. ,
Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952. ,
A general reinforcement learning algorithm that masters chess, shogi, and go through self-play, Science, vol.362, issue.6419, pp.1140-1144, 2018. ,
Reinforcement learning: An introduction, 2018. ,
The natural language of actions, Proc. of ICML, 2019. ,
Learn what not to learn: Action elimination with deep reinforcement learning, Proc. of NeurIPS, 2018. ,
Optimizing chemical reactions with deep reinforcement learning, ACS central science, vol.3, issue.12, pp.1337-1344, 2017. ,