Optimization of serial and parallel communications for parallel geometric multigrid method, 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), pp.25-32, 2014. ,
A parallel SCRIP interpolation library for OASIS, 2018. ,
Accommodating thread-level heterogeneity in coupled parallel applications, 2017 IEEE International Parallel and Distributed Processing Symposium, pp.469-478, 2017. ,
TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. ,
LBANN: Livermore big artificial neural network HPC toolkit, Proceedings of the Workshop on Machine Learning in High-Performance Computing Environments, ser. MLHPC '15, vol.5, pp.1-5, 2015. ,
NWChem, a computational chemistry package for parallel computers, 2007. ,
Development and performance of a new version of the OASIS coupler, Geoscientific Model Development, vol.10, issue.9, pp.3297-3308, 2017. ,
NEMO 4.0 performance: how to identify and reduce unnecessary communications, Sorbonne Universités-CNRS-IRD-MNHN, 2019. ,
Aluminum: An asynchronous, GPU-aware communication library optimized for large-scale training of deep neural networks on HPC systems, Workshop on Machine Learning in High-Performance Computing Environments, ser. MLHPC'18, 2018. ,
Big data assimilation; toward post-petascale severe weather prediction: An overview and progress, Proceedings of the IEEE, vol.104, issue.11, 2016. ,
DTF: An I/O arbitration framework for multi-component data processing workflows, High Performance Computing, ser. ISC'18, pp.63-80, 2018. ,
CANDLE/supervisor: a workflow framework for machine learning applied to cancer research, BMC Bioinformatics, vol.19, issue.18, p.491, 2018. ,
mOS for HPC," in Operating Systems for Supercomputers and High Performance Computing, ser. High-Performance Computing, vol.18, pp.307-334, 2019. ,
Cray Compute Node Linux, ser. High-Performance Computing, pp.99-120, 2019. ,
High-Performance Computing, pp.183-197, 2019. ,
Advancing application process affinity experimentation: open MPI's LAMA-based affinity interface, p.20 ,
, European MPI Users's Group Meeting, EuroMPI '13, pp.163-168, 2013.
MPIPP: An automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, Proceedings of the 20th Annual International Conference on Supercomputing, pp.353-360, 2006. ,
PTRAM: A parallel topology-and routing-aware mapping framework for large-scale HPC systems, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp.386-396, 2016. ,
Automatic topology mapping of diverse large-scale parallel applications, Proceedings of the International Conference on Supercomputing, vol.17, pp.1-17, 2017. ,
RAHTM: routing algorithm aware hierarchical task mapping, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.325-335, 2014. ,
Process placement in multicore clusters: Algorithmic issues and practical techniques, IEEE Transactions on Parallel and Distributed Systems, vol.25, issue.4, pp.993-1002, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00803548
Generic topology mapping strategies for largescale parallel architectures, Proceedings of the 25th International Conference on Supercomputing, pp.75-84, 2011. ,
EagerMap: A task mapping algorithm to improve communication and load balancing in clusters of multicore systems, ACM Transactions on Parallel Computing, vol.5, issue.4, p.24, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02062952
Hierarchical task mapping for parallel applications on supercomputers, The Journal of Supercomputing, vol.71, issue.5, pp.1776-1802, 2015. ,
Method and system for optimizing communication in MPI programs for an execution environment, 2008. ,
Cray performance measurement and analysis tool, 2017. ,
A profile based approach for topology aware MPI rank placement, 2007. ,
LIKWID: Lightweight performance tools, Competence in High Performance Computing, pp.165-175, 2010. ,
OpenMP application programming interface, 2018. ,
Torque resource manager ,
A batch scheduler with high level components, Cluster Computing and Grid 2005 (CCGrid05), 2005. ,
URL : https://hal.archives-ouvertes.fr/hal-00005106
Flux: A next-generation resource management framework for large HPC centers, 43rd International Conference on Parallel Processing Workshops, pp.9-17, 2014. ,
, ACM Queue, vol.14, pp.70-93, 2016.
Composing parallel software efficiently with Lithe, Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2010, pp.376-387, 2010. ,
Hobbes: A multi-kernel infrastructure for application composition, Operating Systems for Supercomputers and High Performance Computing ,
, , 2019.
ARGO: An exascale operating system and runtime, International Conference for High Performance Computing, Networking, Storage and Analysis, 2015. ,
On the scalability, performance isolation and device driver transparency of the IHK/McKernel hybrid lightweight kernel, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp.1041-1050, 2016. ,
On-node resource manager for containerized HPC workloads, Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, ser. CANOPIE-HPC'19, 2019. ,