K. Nakajima, Optimization of serial and parallel communications for parallel geometric multigrid method, 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), pp.25-32, 2014.

A. Piacentini, E. Maisonnave, G. Jonville, L. Coquart, S. Valcke et al., A parallel SCRIP interpolation library for OASIS, 2018.

S. K. Gutierrez, K. Davis, D. C. Arnold, R. S. Baker, R. W. Robey et al., Accommodating thread-level heterogeneity in coupled parallel applications, 2017 IEEE International Parallel and Distributed Processing Symposium, pp.469-478, 2017.

M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015.

B. Van Essen, H. Kim, R. Pearce, K. Boakye, and B. Chen, LBANN: Livermore big artificial neural network HPC toolkit, Proceedings of the Workshop on Machine Learning in High-Performance Computing Environments, ser. MLHPC '15, vol.5, pp.1-5, 2015.

E. J. Bylaska, W. A. Jong, N. Govind, and K. Kowalski, NWChem, a computational chemistry package for parallel computers, 2007.

A. Craig, S. Valcke, and L. Coquart, Development and performance of a new version of the OASIS coupler, Geoscientific Model Development, vol.10, issue.9, pp.3297-3308, 2017.

E. Maisonnave and S. Masson, NEMO 4.0 performance: how to identify and reduce unnecessary communications, Sorbonne Universités-CNRS-IRD-MNHN, 2019.

N. Dryden, N. Maruyama, T. Moon, T. Benson, A. Yoo et al., Aluminum: An asynchronous, GPU-aware communication library optimized for large-scale training of deep neural networks on HPC systems, Workshop on Machine Learning in High-Performance Computing Environments, ser. MLHPC'18, 2018.

T. Miyoshi, G. Y. Lien, S. Satoh, T. Ushio, K. Bessho et al., Big data assimilation; toward post-petascale severe weather prediction: An overview and progress, Proceedings of the IEEE, vol.104, issue.11, 2016.

T. V. Martsinkevich, B. Gerofi, G. Lien, S. Nishizawa, W. Liao et al., DTF: An I/O arbitration framework for multi-component data processing workflows, High Performance Computing, ser. ISC'18, pp.63-80, 2018.

J. M. Wozniak, R. Jain, P. Balaprakash, J. Ozik, N. T. Collier et al., CANDLE/supervisor: a workflow framework for machine learning applied to cancer research, BMC Bioinformatics, vol.19, issue.18, p.491, 2018.

R. Riesen and R. W. Wisniewski, mOS for HPC, in Operating Systems for Supercomputers and High Performance Computing, ser. High-Performance Computing, vol.18, pp.307-334, 2019.

L. Kaplan and J. Harrell, Cray Compute Node Linux, ser. High-Performance Computing, pp.99-120, 2019.

T. Kato and K. Hirai, K Computer, ser. High-Performance Computing, pp.183-197, 2019.

J. Hursey and J. M. Squyres, Advancing application process affinity experimentation: Open MPI's LAMA-based affinity interface, 20th European MPI Users' Group Meeting, EuroMPI '13, pp.163-168, 2013.

H. Chen, W. Chen, J. Huang, B. Robert, and H. Kuhn, MPIPP: An automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, Proceedings of the 20th Annual International Conference on Supercomputing, pp.353-360, 2006.

S. H. Mirsadeghi and A. Afsahi, PTRAM: A parallel topology- and routing-aware mapping framework for large-scale HPC systems, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp.386-396, 2016.

J. J. Galvez, N. Jain, and L. V. Kalé, Automatic topology mapping of diverse large-scale parallel applications, Proceedings of the International Conference on Supercomputing, vol.17, pp.1-17, 2017.

A. H. Abdel-Gawad, M. Thottethodi, and A. Bhatele, RAHTM: routing algorithm aware hierarchical task mapping, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.325-335, 2014.

E. Jeannot, G. Mercier, and F. Tessier, Process placement in multicore clusters: Algorithmic issues and practical techniques, IEEE Transactions on Parallel and Distributed Systems, vol.25, issue.4, pp.993-1002, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00803548

T. Hoefler and M. Snir, Generic topology mapping strategies for large-scale parallel architectures, Proceedings of the 25th International Conference on Supercomputing, pp.75-84, 2011.

E. H. Cruz, M. Diener, L. L. Pilla, and P. O. Navaux, EagerMap: A task mapping algorithm to improve communication and load balancing in clusters of multicore systems, ACM Transactions on Parallel Computing, vol.5, issue.4, p.24, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02062952

J. Wu, X. Xiong, and Z. Lan, Hierarchical task mapping for parallel applications on supercomputers, The Journal of Supercomputing, vol.71, issue.5, pp.1776-1802, 2015.

E. Duesterwald, R. W. Wisniewski, P. F. Sweeney, G. Cascaval, and S. E. Smith, Method and system for optimizing communication in MPI programs for an execution environment, 2008.

Cray, Cray performance measurement and analysis tool, 2017.

D. Solt, A profile based approach for topology aware MPI rank placement, 2007.

J. Treibig, G. Hager, and G. Wellein, LIKWID: Lightweight performance tools, Competence in High Performance Computing, pp.165-175, 2010.

OpenMP Architecture Review Board, OpenMP application programming interface, 2018.

Adaptive Computing, Torque resource manager.

N. Capit, G. Costa, Y. Georgiou, G. Huard, C. Martin et al., A batch scheduler with high level components, Cluster Computing and Grid 2005 (CCGrid05), 2005.
URL : https://hal.archives-ouvertes.fr/hal-00005106

D. H. Ahn, J. Garlick, M. Grondona, D. Lipari, B. Springmeyer et al., Flux: A next-generation resource management framework for large HPC centers, 43rd International Conference on Parallel Processing Workshops, pp.9-17, 2014.

B. Burns, B. Grant, D. Oppenheimer, E. Brewer, J. Wilkes et al., Borg, Omega, and Kubernetes, ACM Queue, vol.14, pp.70-93, 2016.

H. Pan, B. Hindman, and K. Asanovic, Composing parallel software efficiently with Lithe, Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2010, pp.376-387, 2010.

B. Kocoloski, J. Lange, K. Pedretti, and R. Brightwell, Hobbes: A multi-kernel infrastructure for application composition, in Operating Systems for Supercomputers and High Performance Computing, B. Gerofi, Y. Ishikawa, R. Riesen, and R. W. Wisniewski, Eds., 2019.

S. Perarnau, R. Gupta, P. Beckman, P. Balaji, C. Bordage et al., ARGO: An exascale operating system and runtime, International Conference for High Performance Computing, Networking, Storage and Analysis, 2015.

B. Gerofi, M. Takagi, A. Hori, G. Nakamura, T. Shirasawa et al., On the scalability, performance isolation and device driver transparency of the IHK/McKernel hybrid lightweight kernel, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp.1041-1050, 2016.

G. Vallee, C. E. Gutierrez, and C. Clerget, On-node resource manager for containerized HPC workloads, Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, ser. CANOPIE-HPC'19, 2019.