P. Anderson, A. Chang, D. S. Chaplot, A. Dosovitskiy, S. Gupta et al., On evaluation of embodied navigation agents, 2018.

K. J. Åström, Optimal control of Markov processes with incomplete state information, Journal of Mathematical Analysis and Applications, vol.10, issue.1, pp.174-205, 1965.

P. Battaglia, R. Pascanu, M. Lai, and D. J. Rezende, Interaction networks for learning about objects, relations and physics, Advances in neural information processing systems, pp.4502-4510, 2016.

P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-gonzalez, V. Zambaldi et al., Relational inductive biases, deep learning, and graph networks, 2018.

E. Beeching, C. Wolf, J. Dibangoye, and O. Simonin, EgoMap: projective mapping and structured egocentric memory for deep RL, 2020.
URL : https://hal.archives-ouvertes.fr/hal-02864146

R. Bellman, On a routing problem, Quarterly of applied mathematics, vol.16, issue.1, pp.87-90, 1958.

S. Bhatti, A. Desmaison, O. Miksik, N. Nardelli, N. Siddharth et al., Playing Doom with SLAM-augmented deep reinforcement learning, 2016.

M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst, Geometric deep learning: going beyond Euclidean data, IEEE Signal Processing Magazine, vol.34, issue.4, pp.18-42, 2017.

D. S. Chaplot, D. Gandhi, S. Gupta, A. Gupta, and R. Salakhutdinov, Learning to explore using active neural SLAM, International Conference on Learning Representations, 2020.

T. Chen, S. Gupta, and A. Gupta, Learning exploration policies for navigation, International Conference on Learning Representations, 2019.

J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling, 2014.

J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, Gated feedback recurrent neural networks, Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015.

Y. N. Dauphin, A. Fan, M. Auli, and D. Grangier, Language modeling with gated convolutional networks, Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.933-941, 2017.

E. W. Dijkstra, A note on two problems in connexion with graphs, Numerische mathematik, vol.1, issue.1, pp.269-271, 1959.

B. Eysenbach, R. R. Salakhutdinov, and S. Levine, Search on the replay buffer: Bridging planning and reinforcement learning, Advances in Neural Information Processing Systems, vol.32, pp.15220-15231, 2019.

A. Fout, J. Byrd, B. Shariat, and A. Ben-Hur, Protein interface prediction using graph convolutional networks, Advances in neural information processing systems, pp.6530-6539, 2017.

J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, Neural message passing for quantum chemistry, Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.1263-1272, 2017.

A. Graves, G. Wayne, and I. Danihelka, Neural turing machines, 2014.

A. Graves, G. Wayne, M. Reynolds, T. Harley, I. Danihelka et al., Hybrid computing using a neural network with dynamic external memory, Nature, vol.538, issue.7626, p.471, 2016.

S. Gupta, J. Davidson, S. Levine, R. Sukthankar, and J. Malik, Cognitive mapping and planning for visual navigation, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.7272-7281, 2017.

S. Gupta, D. Fouhey, S. Levine, and J. Malik, Unifying map and landmark based representations for visual navigation, 2017.

S. Hochreiter and J. Schmidhuber, Long Short-Term Memory, Neural Computation, vol.9, issue.8, pp.1735-1780, 1997.

M. Jaderberg, V. Mnih, W. M. Czarnecki, T. Schaul, J. Z. Leibo et al., Reinforcement learning with unsupervised auxiliary tasks, 2017.

C. K. Joshi, T. Laurent, and X. Bresson, An efficient graph convolutional network technique for the travelling salesman problem, 2019.

L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, Planning and acting in partially observable stochastic domains, Artificial intelligence, vol.101, issue.1-2, pp.99-134, 1998.

P. Karkus, D. Hsu, and W. S. Lee, QMDP-net: Deep learning for planning under partial observability, 2017.

M. Kempka, M. Wydmuch, G. Runc, J. Toczek, and W. Jaskowski, ViZDoom: A Doom-based AI research platform for visual reinforcement learning, IEEE Conference on Computational Intelligence and Games, 2016.

T. N. Kipf and M. Welling, Semi-supervised classification with graph convolutional networks, International Conference on Learning Representations, 2017.

H. Kurniawati, D. Hsu, and W. S. Lee, SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces, Proc. Robotics: Science and Systems, 2008.

S. M. LaValle, Planning algorithms, Cambridge University Press, 2006.
URL : https://hal.archives-ouvertes.fr/hal-01993243

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.

Z. Li, Q. Chen, and V. Koltun, Combinatorial optimization with graph convolutional networks and guided tree search, Advances in Neural Information Processing Systems, pp.539-548, 2018.

M. Savva, A. Kadian, O. Maksymets et al., Habitat: A platform for embodied AI research, Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019.

P. Mirowski, M. K. Grimes, M. Malinowski, K. M. Hermann, K. Anderson et al., Learning to navigate in cities without a map, 2018.

P. Mirowski, R. Pascanu, F. Viola, H. Soyer, A. J. Ballard et al., Learning to navigate in complex environments, 2017.

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.

N. Neverova, C. Wolf, G. Taylor, and F. Nebout, ModDrop: adaptive multi-modal gesture recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.8, pp.1692-1706, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01178733

E. Parisotto and R. Salakhutdinov, Neural map: Structured memory for deep reinforcement learning, 2018.

A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury et al., PyTorch: An imperative style, high-performance deep learning library, Advances in Neural Information Processing Systems, vol.32, pp.8024-8035, 2019.

E. Remolina and B. Kuipers, Towards a general theory of topological maps, Artificial Intelligence, vol.152, pp.47-104, 2004.

N. Savinov, A. Dosovitskiy, and V. Koltun, Semi-parametric topological memory for navigation, International Conference on Learning Representations, 2018.

M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, and M. Welling, Modeling relational data with graph convolutional networks, European Semantic Web Conference, pp.593-607, 2018.

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, Proximal policy optimization algorithms, 2017.

G. Shani, J. Pineau, and R. Kaplow, A survey of point-based POMDP solvers, Autonomous Agents and Multi-Agent Systems, vol.27, issue.1, pp.1-51, 2013.

H. Shatkay and L. P. Kaelbling, Learning topological maps with weak local odometric information, pp.920-929, 1997.

D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai et al., A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, Science, vol.362, issue.6419, pp.1140-1144, 2018.

R. D. Smallwood and E. J. Sondik, The optimal control of partially observable Markov processes over a finite horizon, Operations Research, vol.21, issue.5, pp.1071-1088, 1973.

T. Smith and R. Simmons, Heuristic search value iteration for POMDPs, Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pp.520-527, 2004.

A. Srinivas, A. Jabri, P. Abbeel, S. Levine, and C. Finn, Universal planning networks, 2018.

A. Tamar, Y. Wu, G. Thomas, S. Levine, and P. Abbeel, Value iteration networks, 2016.

S. Thrun, Learning metric-topological maps for indoor mobile robot navigation, Artificial Intelligence, vol.99, issue.1, pp.21-71, 1998.

R. F. Wang and E. S. Spelke, Human spatial representation: insights from animals, Trends in Cognitive Sciences, vol.6, issue.9, pp.376-382, 2002.

G. Wayne, C. C. Hung, D. Amos, M. Mirza, A. Ahuja et al., Unsupervised predictive memory in a goal-directed agent, 2018.

Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang et al., A comprehensive survey on graph neural networks, 2019.

F. Xia, A. R. Zamir, Z. He, A. Sax, J. Malik, and S. Savarese, Gibson Env: real-world perception for embodied agents, Computer Vision and Pattern Recognition (CVPR), 2018.

K. Xu, J. Li, M. Zhang, S. S. Du, K. Kawarabayashi et al., What can neural networks reason about?, arXiv preprint arXiv:1905.13211, 2019.

J. Zhang, L. Tai, J. Boedecker, W. Burgard, and M. Liu, Neural SLAM, 2017.

J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu et al., Graph neural networks: A review of methods and applications, 2018.

Y. Zhu, R. Mottaghi, E. Kolve, J. J. Lim, A. Gupta et al., Target-driven visual navigation in indoor scenes using deep reinforcement learning, 2017 IEEE International Conference on Robotics and Automation (ICRA), pp.3357-3364, 2017.