B. Han, V. Gopalakrishnan, L. Ji, and S. Lee, Network function virtualization: Challenges and opportunities for innovations, IEEE Communications Magazine, vol.53, issue.2, pp.90-97, 2015.

I. Afolabi, T. Taleb, K. Samdanis, A. Ksentini, and H. Flinck, Network slicing and softwarization: A survey on principles, enabling technologies, and solutions, IEEE Communications Surveys & Tutorials, vol.20, issue.3, pp.2429-2453, 2018.

I. Jang, D. Suh, S. Pack, and G. Dán, Joint optimization of service function placement and flow distribution for service function chaining, IEEE Journal on Selected Areas in Communications, vol.35, issue.11, pp.2532-2541, 2017.

G. Mirjalily and Z. Luo, Optimal network function virtualization and service function chaining: A survey, Chinese Journal of Electronics, vol.27, pp.704-717, 2018.

S. Vassilaras, L. Gkatzikis, N. Liakopoulos, I. N. Stiakogiannakis, M. Qi et al., The algorithmic aspects of network slicing, IEEE Communications Magazine, vol.55, issue.8, pp.112-119, 2017.
URL : https://hal.archives-ouvertes.fr/hal-02305665

S. Khebbache, M. Hadji, and D. Zeghlache, Scalable and cost-efficient algorithms for VNF chaining and placement problem, Proc. ICIN, pp.92-99, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01630011

A multi-objective non-dominated sorting genetic algorithm for VNF chains placement, Proc. IEEE CCNC, pp.1-4, 2018.

R. Mijumbi, J. Gorricho, J. Serrat, M. Claeys, F. D. Turck et al., Design and evaluation of learning algorithms for dynamic resource management in virtual networks, IEEE NOMS, pp.1-9, 2014.

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez et al., Continuous control with deep reinforcement learning, CoRR, 2015.

G. Dulac-Arnold, R. Evans, P. Sunehag, and B. Coppin, Reinforcement learning in large discrete action spaces, CoRR, 2015.

Z. Xu, J. Tang, J. Meng, W. Zhang, Y. Wang et al., Experience-driven Networking: A Deep Reinforcement Learning based Approach, IEEE INFOCOM, pp.1871-1879, 2018.

P. T. Quang, A. Bradai, K. D. Singh, G. Picard, and R. Riggio, Single and Multi-Domain Adaptive Allocation Algorithms for VNF Forwarding Graph Embedding, IEEE Transactions on Network and Service Management, vol.16, issue.1, pp.98-112, 2019.

R. Riggio, A. Bradai, D. Harutyunyan, T. Rasheed, and T. Ahmed, Scheduling wireless virtual networks functions, IEEE Transactions on Network and Service Management, vol.13, issue.2, pp.240-252, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01292250

G. Stampa, M. Arias, D. Sanchez-Charles, V. Muntés-Mulero, and A. Cabellos, A deep-reinforcement learning approach for software-defined networking routing optimization, CoRR, 2017.

Y. Xie, Z. Liu, S. Wang, and Y. Wang, Service function chaining resource allocation: A survey, CoRR, 2016.

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness et al., Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.

The Internet Topology Zoo

A. Varga, Discrete event simulation system, Proc. of the European Simulation Multiconference, 2011.

P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci., vol.5, pp.17-61, 1960.

X. Glorot, A. Bordes, and Y. Bengio, Deep sparse rectifier neural networks, Proc. AIStats, pp.315-323, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00752497

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, CoRR, 2014.

P. T. Quang, Y. Hadjadj-Aoul, and A. Outtagarts, A deep reinforcement learning approach for VNF Forwarding Graph Embedding, IEEE Transactions on Network and Service Management, vol.16, issue.4, 2019.

D. Tai, H. Dai, T. Zhang, and B. Liu, On data plane latency and pseudo-TCP congestion in Software-Defined Networking, Proc. ACM/IEEE ANCS, pp.133-134, 2016.

S. Khebbache, M. Hadji, and D. Zeghlache, Virtualized network functions chaining and routing algorithms, Computer Networks, vol.114, pp.95-110, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01471730