An Overview of Process Mapping Techniques and Algorithms in High-Performance Computing, High Performance Computing on Complex Environments. Edited by Emmanuel JEANNOT and Julius ŽILINSKAS, pp.65-84, 2014. ,
, Topology-Aware Job Mapping, International Journal of High Performance Computing Applications, vol.32, pp.14-27, 2018.
Hardware topology management in MPI applications through hierarchical communicators, Parallel Computing, vol.76, pp.70-90, 2018. ,
, Process Placement in Multicore Clusters : Algorithmic Issues and Practical Techniques, vol.25, pp.993-1002, 2014.
Implementation and Evaluation of Shared-Memory Communication and Synchronization Operations in MPICH2 using the Nemesis Communication Subsystem, Parallel Computing, vol.33, pp.634-644, 2007. ,
High Performance Computing on Heterogeneous Clusters with the Madeleine II Communication Library, Cluster Computing, vol.5, pp.43-54, 2002. ,
International peer-reviewed conferences
Exposition, Clarification, and Expansion of MPI Semantic Terms and Conventions : Is a nonblocking function permitted to block?, Proceedings of the 26th European MPI Users' Group Meeting, 2019. ,
Online Dynamic Monitoring of MPI Communications, Euro-Par 2017 : Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, pp.49-62, 2017. ,
Topology-aware resource management for HPC applications, Proceedings of the 18th International Conference on Distributed Computing and Networking, p.17, 2017. ,
A hierarchical model to manage hardware topology in MPI applications, Proceedings of the 24th European MPI Users' Group Meeting, vol.9, pp.1-9, 2017. ,
Communication and Topology-aware Load Balancing in Charm++ with TreeMatch, IEEE Cluster, pp.1-8, 2013. ,
Improving MPI Applications Performance on Multicore Clusters with Rank Reordering, pp.39-49, 2011. ,
Hwloc : a Generic Framework for Managing Hardware Affinities in HPC Applications, Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), 2010. ,
Edited by Pasqua D'AMBRA, Mario Rosario GUARRACINO and Domenico TALIA. Vol. 6272, Euro-Par 2010 Parallel Processing, pp.199-210, 2010. ,
Cache-Efficient, Intranode Large-Message MPI Communication with MPICH2-Nemesis, Proceedings of the 38th International Conference on Parallel Processing, 2009. ,
Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments, EuroPVM/MPI. Vol. 5759, pp.104-115, 2009. ,
An Efficient Support for High-Performance Networks in MPICH2, Proceedings of 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS'09), 2009. ,
Data Transfer in a SMP System : Study and Application to MPI, Proc. 35th International Conference on Parallel Processing, 2006. ,
Design and Evaluation of Nemesis : a Scalable, Low-Latency, Message-Passing Communication Subsystem, Proc. 6th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2006. ,
Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, Recent Advances in Parallel Virtual Machine and Message Passing Interface : Proceedings of the 13th European PVM/MPI Users Group Meeting (Euro PVM/MPI), 2006. ,
Cluster of Clusters-Enabled MPI Implementation, Proc. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, pp.26-35, 2003. ,
True Multi-Protocol MPI for High-Performance Networks, Proc. 15th International Parallel and Distributed Processing Symposium (IPDPS), p.51, 2001. ,
A Portable and Efficient Communication Library for High-Performance Cluster Computing, IEEE International Conference on Cluster Computing, pp.78-87, 2000. ,
International peer-reviewed workshops
Scale Experiment for Topology-Aware Resource Management, Euro-Par 2017 : Parallel Processing Workshops - Euro-Par 2017 International Workshops, pp.179-186, 2017. ,
Aware Hierarchical and Distributed Load-Balancing in Charm++, First International Workshop on Communication Optimizations in HPC, COMHPC@SC 2016, pp.63-72, 2016. ,
High-Performance Multi-Rail Support with the NewMadeleine Communication Library, HCW 2007 : the Sixteenth International Heterogeneity in Computing Workshop, held in conjunction with IPDPS, p.9, 2007. ,
French peer-reviewed conferences
Un algorithme de placement de processus sur architectures multicoeurs, Compass 2013, 2013. ,
Performance Evaluation of MPICH-Madeleine against the Multi-Protocol MPI Implementations for Homogeneous and Heterogeneous SMP Clusters, Proc. 5th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2005. ,
Research reports
Hardware topology management in MPI applications through hierarchical communicators, 2018. ,
Process Placement in Multicore Clusters : Algorithmic Issues and Practical Techniques, 2013. ,
Data Transfer in a SMP System : Study and Application to MPI, 2005. ,
Design and Evaluation of Nemesis : a Scalable, Low-Latency, Message-Passing Communication Subsystem, 2005. ,
MPICH-Madeleine : a True Multi-Protocol MPI for High-Performance Networks, 2000. ,
A Portable and Efficient Communication Library for High-Performance Cluster Computing, 2000. ,
Performance Portable Communication in Hierarchical, Heterogeneous and Dynamical Environments. In French only, 2004. ,
Routing Algorithm Aware Hierarchical Task Mapping, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.325-335, 2014. ,
, Topology-aware task mapping for reducing communication contention on large parallel machines, pp.25-29, 2006.
Aware Mappings for Large-Scale Eigenvalue Problems, Euro-Par 2012 Parallel Processing - 18th International Conference. Vol. 7484, pp.830-842, 2012. ,
Scalable Node Allocation for Improved Performance in Regular and Anisotropic 3D Torus Supercomputers, EuroMPI 2011, pp.61-70, 2011. ,
Edited by Pekka MANNINEN and Per ÖSTER, Applied Parallel and Scientific Computing, 2013. ,
An efficient heuristic procedure for partitioning graphs, Bell System Technical Journal, vol.49, issue.2, pp.291-307, 1970. ,
A network-topology independent task allocation strategy for parallel computers, Proceedings Supercomputing '90, pp.878-887, 1990. ,
, NAS Parallel Benchmark Results. Tech. rep. 94-006. RNR, 1994.
Mapping communication layouts to network hardware characteristics on massive-scale blue gene systems, Computer Science - R&D, vol.26, pp.247-256, 2011. ,
Running Parallel Applications with Topology-Aware Grid Middleware, Fifth International Conference on e-Science, pp.292-299, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00684522
, On Mapping Parallel Algorithms into Parallel Architectures, vol.5, 1987.
A survey of MPI usage in the US exascale computing project, Concurrency and Computation : Practice and Experience, e4851 ,
Optimizing task layout on the Blue Gene/L supercomputer, IBM Journal of Research and Development, vol.49, pp.489-500, 2005. ,
Topology aware task mapping techniques : an api and case study, pp.301-302, 2009. ,
Optimizing communication for Charm++ applications by reducing network contention, Concurrency and Computation : Practice and Experience, vol.23, pp.211-222, 2011. ,
Benefits of Topology Aware Mapping for Mesh Interconnects, Parallel Processing Letters, vol.18, pp.549-566, 2008. ,
, SC Conference on High Performance Computing Networking, Storage and Analysis, SC '12, p.97, 2012.
Optimizing the performance of parallel applications on a 5D torus via task mapping, 21st International Conference on High Performance Computing, HiPC, pp.1-10, 2014. ,
Topology-Aware Parallel Sparse Matrix Vector Multiplication, 2016. ,
On the Mapping Problem, IEEE Transactions on Computers, vol.30, pp.207-214, 1981. ,
Heuristic Technique for Processor and Link Assignment in Multicomputers, IEEE Trans. Comput, vol.40, pp.325-333, 1991. ,
The Zoltan and Isorropia parallel toolkits for combinatorial scientific computing : Partitioning, ordering and coloring, Scientific Programming, vol.20, pp.129-150, 2012. ,
, Revised Selected Papers. Vol. 10659. Lecture Notes in Computer Science, pp.157-166, 2017.
, 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.523-532, 2018.
Rank Reordering for MPI Communication Optimization, Computers & Fluids, vol.80, pp.372-380, 2013. ,
, Operating System Support for Efficient Data Sharing Among Processes on a Multi-Core Processor, 2008.
A Portable Programming Interface for Performance Evaluation on Modern Processors, The International Journal of High Performance Computing Applications, vol.14, pp.189-204, 2000. ,
Topology Aware Scheduling in the LSF Distributed Resource Manager, Proceedings of the Cray User Group Meeting, 2001. ,
A batch scheduler with high level components, Cluster computing and Grid 2005 (CCGrid05), 2005. ,
, Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), p.12, 2000.
Partitioning Tool for Hypergraphs), Encyclopedia of Parallel Computing, pp.1479-1487, 2011. ,
Designing An Efficient Kernel-level and User-level Hybrid Approach for MPI Intra-node Communication on Multi-core Systems, Proceedings of the IEEE International Conference on Parallel Processing (ICPP-2008), 2008. ,
Topology-aware tile mapping for clusters of SMPs, Proceedings of the Third Conference on Computing Frontiers, pp.383-392, 2006. ,
MPIPP : an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, Proceedings of the 20th Annual International Conference on Supercomputing, 2006. ,
Predicting the Effect of Mapping on the Communication Performance of Large Multicomputers, Proceedings of the International Conference on Parallel Processing, ICPP '91, vol.II, pp.1-4, 1991. ,
Exploitation efficace des architectures parallèles de type grappes de NUMA à l'aide de modèles hybrides de programmation, 2012. ,
Cross Memory Attach, 2010. ,
Executing a hybrid MPI/OpenMP job in batch under the Intel environment, 2016. ,
Virtual Interface Architecture Specification V 1.0, 1997. ,
, Euro-Par 2009 Parallel Processing, 15th International Euro-Par Conference, pp.466-477, 2009.
Performance Measurement and Analysis Tool, 2017. ,
A Task Mapping Algorithm to Improve Communication and Load Balancing in Clusters of Multicore Systems, ACM Transactions on Parallel Computing, vol.5, p.24, 2019. ,
Reducing the bandwidth of sparse symmetric matrices, Proceedings of the 1969 24th national conference. ACM '69, pp.157-172, 1969. ,
A profile based approach for topology aware MPI rank placement, 2007. ,
A Threads-Only MPI Implementation for the Development of Parallel Programs, Proceedings of the 11th International Symposium on High Performance Computing Systems (HPCS'97), pp.153-163, 1997. ,
Design of ion-implanted MOSFET's with very small physical dimensions, IEEE Solid-State Circuits Society Newsletter, vol.12, issue.1, pp.38-50, 1974. ,
Exploiting Geometric Partitioning in Task Mapping for Parallel Computers, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp.27-36, 2014. ,
, 2015 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, pp.197-206, 2015.
, Hypergraph partitioning for multiple communication cost metrics : Model and methods, vol.77, pp.69-83, 2015.
Zoltan Data Management Services for Parallel Dynamic Applications, Computing in Science and Engineering, vol.4, pp.90-97, 2002. ,
Characterizing communication and page usage of parallel applications for thread and data mapping, Perform. Eval, vol.88, pp.18-36, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01146859
The International Exascale Software Project : A Call To Cooperative Action By the Global High-Performance Community, Int. J. High Perform. Comput. Appl, vol.23, 2009. ,
Mapping Algorithms for Multiprocessor Tasks on Multi-Core Clusters, 2008 International Conference on Parallel Processing, pp.141-148, 2008. ,
Heterogeneous MPI Application Interoperation and Process Management under PVMPI, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Proceedings of the 4th European PVM/MPI Users' Group Meeting. Vol. 1332. Lecture Notes in Computer Science, pp.91-98, 1997. ,
Method and System for Optimizing Communication in MPI Programs for an Execution Environment, 2008. ,
A linear-time heuristic for improving network partitions, Proceedings of the 19th Design Automation Conference, DAC '82, pp.175-181, 1982. ,
MPI on the I-WAY : A Wide-Area, Multimethod Implementation of the Message Passing Interface ,
, Interconnect topology-aware resource assignment
, FUJITSU. Hardware Topology Aware MPI extensions on Fujitsu PRIMEHPC FX10, FX100, 2018.
Distributed Computing in a Heterogeneous Computing Environment, Recent Advances in Parallel Virtual Machine and Message Passing Interface. Edited by Vassil ALEXANDROV and Jack DONGARRA, Lecture Notes in Computer Science, pp.180-188, 1998. ,
, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 11th European PVM/MPI Users' Group Meeting, pp.97-104, 2004.
Automatic topology mapping of diverse large-scale parallel applications, Proceedings of the International Conference on Supercomputing, vol.17, pp.1-17, 2017. ,
A Multithread-Safe Implementation of MPI, Recent Advances in PVM and MPI. 6th PVM/MPI European User's Group Meeting. Vol. 1697. Lecture Notes in Computer Science, pp.207-214, 1999. ,
Making MPI Interoperable, Journal of research of the National Institute of Standards and Technology, vol.105, pp.343-348, 2000. ,
Making the Case for Portable MPI Process Pinning, Poster presented at the 25th European MPI Users' Group Meeting, 2018. ,
Towards generic Communication Mechanisms and better Affinity Management in Clusters of Hierarchical Nodes, Habilitation à diriger des recherches (French habilitation thesis) ,
Towards a Comprehensive View of the HPC System Topology, 43rd International Conference on Parallel Processing Workshops, pp.216-225, 2014. ,
Dodging Non-Uniform I/O Access in Hierarchical Collective Operations for Multicore Clusters, CASS 2011 : The 1st Workshop on Communication Architecture for Scalable Systems, held in conjunction with IPDPS, 2011. ,
Generic and Scalable Kernel-Assisted Intra-node MPI Communication Framework, Journal of Parallel and Distributed Computing (JPDC), vol.73, issue.2, pp.176-188, 2013. ,
, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, pp.130-140, 2008.
Learning from the Success of MPI, High Performance Computing - HiPC, pp.81-94, 2001. ,
A New Start for MPI Implementations, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, p.7, 2002. ,
Using Node Information to Implement MPI Cartesian Topologies, Proceedings of the 25th European MPI Users' Group Meeting, vol.18, 2018. ,
Application-oriented adaptive MPI_Bcast for grids, 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Proceedings, pp.25-29, 2006. ,
, Accommodating Thread-Level Heterogeneity in Coupled Parallel Applications, pp.469-478, 2017.
State-of-the-Art Sharing-Aware Mapping Methods, Thread and Data Mapping for Multicore Systems : Improving Communication and Memory Accesses, Springer International Publishing, pp.35-48, 2018. ,
Rank Reordering Strategy for MPI Topology Creation Functions, Recent Advances in Parallel Virtual Machine and Message Passing Interface. Edited by V. ALEXANDROV and J. DONGARRA. Vol. 1497. Lecture Notes in Computer Science, pp.188-195, 1998. ,
Using OpenMP at NERSC, Invited talk at OpenMPCon 2015 (see slide 20), Sept. 2015 ,
The Chaco User's Guide : Version 2.0. Tech. rep. SAND94-2692, 1994. ,
, Generic Topology Mapping Strategies for Large-Scale Parallel Architectures, pp.75-84, 2011.
, 23rd IEEE International Symposium on Parallel and Distributed Processing, pp.1-8, 2009.
The scalable process topology interface of MPI 2.2, Concurrency and Computation : Practice and Experience, vol.23, pp.293-310, 2011. ,
MPI : a new hybrid approach to parallel programming with MPI plus shared memory, Computing 95, vol.12, pp.1121-1136, 2013. ,
Leveraging Runtime Infrastructure to Increase Scalability of Applications at Exascale, Proceedings of the 23rd European MPI Users' Group Meeting, pp.121-129, 2016. ,
Revised Papers. Vol. 2958. Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, 16th International Workshop, LCPC 2003, pp.306-322, 2003. ,
Advancing application process affinity experimentation : open MPI's LAMA-based affinity interface, 20th European MPI Users' Group Meeting, EuroMPI '13, pp.163-168, 2013. ,
Aware Parallel Process Mapping for Multi-core HPC Systems, 2011 IEEE International Conference on Cluster Computing (CLUSTER), pp.527-531, 2011. ,
MPI-StarT : delivering network performance to numerical applications, Proceedings of the 1998 ACM/IEEE conference on Supercomputing (CDROM), pp.1-15, 1998. ,
An Architecture of Stampi : MPI Library on a Cluster of Parallel Computers, Proceedings of EuroPVM/MPI2000. Vol. 1908. Lecture Notes in Computer Science, pp.200-207, 2000. ,
Using MPI Tuner for Intel® MPI Library on Linux* OS, 2016. ,
(NB: this article is available only in Japanese), Cluster Computing Research Center, 2003. ,
Automatically Optimized Core Mapping to Subdomains of Domain Decomposition Method on Multicore Parallel Environments, Computers & Fluids, Apr. 2012 ,
Improving MPI Application Communication Time with an Introspection Monitoring Library, p.23, 2019. ,
Support for High-Performance MPI Intra-Node Communication on Linux Cluster, Proceedings of the IEEE International Conference on Parallel Processing (ICPP-2005), 2005. ,
Lightweight Kernel-Level Primitives for High-Performance MPI Intra-Node Communication over Multi-Core Systems, Proceedings of the IEEE International Conference on Cluster Computing (Cluster'07), 2007. ,
Approximation Algorithms for the Weighted Independent Set Problem, LNCS, vol.3787, pp.341-350, 2005. ,
A Grid-Enabled Implementation of the Message Passing Interface, Journal of Parallel and Distributed Computing, vol.63, issue.5, pp.551-563, May 2003. ,
Exploiting Hierarchy in Parallel Computer Networks to Optimize Collective Operation Performance, Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), pp.377-384, 2000. ,
METIS - Unstructured Graph Partitioning and Sparse Matrix Ordering System, Version 2.0. Tech. rep., 1995. ,
MPI's Collective Communication Operations for Clustered Wide Area Systems, Proceedings of the 1999 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'99), pp.131-140, 1999. ,
A Heterogeneous Computing Environment to Solve the 768-bit RSA Challenge, Cluster Computing, 2010. ,
Edited by Tal RABIN. Vol. 6223. Lecture Notes in Computer Science, CRYPTO 2010, pp.333-350, 2010. ,
, Jahresber. Deutsch. Math.-Verein, vol.300, 1955.
Blue Waters and Resource Management - Now and in the Future, Presentation at MoabCon (see slide 14), 2013. ,
Designing truly one-sided MPI-2 RMA intra-node communication on multi-core systems, Proceedings of the International Supercomputing Conference (ISC'10), 2010. ,
Intégration de la parallélisation mixte MPI-OpenMP dans les configurations de l'IPSL, 2016. ,
Towards a Message-Passing Library for Heterogeneous Networks of Computers, 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), p.102, 2003. ,
Towards a message-passing library for heterogeneous networks of computers, Journal of Parallel and Distributed Computing, vol.66, issue.2, 2006. ,
, 23rd IEEE International Conference on Parallel and Distributed Systems, pp.710-719, 2017.
, Topology-aware Resource Allocation for Data-intensive Workloads, APSys '10, pp.1-6, 2010.
A Mapping Strategy for Parallel Processing, IEEE Trans. Comput, vol.36, pp.433-442, 1987. ,
mpibind : a memory-centric affinity algorithm for hybrid applications, Proceedings of the International Symposium on Memory Systems, MEMSYS 2017, pp.262-264, 2017. ,
, Topology-Aware Process Mapping on Clusters Featuring NUMA and Hierarchical Network, pp.74-81, 2013.
, Topology-aware Job Allocation in 3D Torus-based HPC Systems with Hard Job Priority Constraints, pp.515-524, 2017.
Aware Scheduling on Blue Waters with Proactive Queue Scanning and Migration-Based Job Placement, Job Scheduling Strategies for Parallel Processing, 2017. ,
NUMA-aware shared-memory collective communication for MPI, The 22nd International Symposium on High-Performance Parallel and Distributed Computing, HPDC'13, pp.85-96, 2013. ,
Design and Evaluation of a Resource Selection Framework for Grid Applications, Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing. HPDC '02, p.63, 2002. ,
Formal modeling and performance evaluation of a run-time rank remapping technique in Broadcast, Allgather and Allreduce MPI collective operations, Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.963-972, 2017. ,
, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, pp.187-194, 2007.
, 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.586-595, 2015.
High Performance Design and Implementation of Nemesis Communication Layer for Two-Sided and One-Sided MPI Semantics in MVAPICH2, 39th International Conference on Parallel Processing, pp.377-386, 2010. ,
MPICH Working Note : The Second-Generation ADI for the MPICH Implementation of MPI, 1996. ,
An Approach for Matching Communication Patterns in Parallel Applications, Proceedings of 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS'09), 2009. ,
, Recent Advances in the Message Passing Interface - 17th European MPI Users' Group Meeting, pp.265-274, 2010.
, 2011 IEEE International Conference on Cluster Computing (CLUSTER), pp.196-204, 2011.
, HierKNEM : An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Many-core Clusters, pp.970-982, 2012.
Kernel-assisted and topology-aware MPI collective communications on multicore/many-core platforms, J. Parallel Distrib. Comput, vol.73, pp.1000-1010, 2013. ,
Efficient SMP-aware MPI-level broadcast over InfiniBand's hardware multicast, 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Proceedings, pp.25-29, 2006. ,
A Topology-Aware MPI Process Placement Algorithm for Multi-core Clusters, Intelligent Computing and Information and Communication, pp.67-76, 2018. ,
Efficient MPI Collective Operations for Clusters in Long-and-Fast Networks, Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006. ,
Static Approximation of MPI Communication Graphs for Optimized Process Placement, Languages and Compilers for Parallel Computing, 2015. ,
, Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors, vol.9, pp.21-65, 1991.
Hierarchical Task Placement to Enable a Tapered Fat Tree Topology for Lower Power and Cost in HPC Networks, Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.228-237, 2017. ,
A Parallel Topology- and Routing-Aware Mapping Framework for Large-Scale HPC Systems, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp.386-396, May 2016. ,
Aware Rank Reordering for MPI Collectives, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops, pp.1759-1768, 2016. ,
Cramming more components onto integrated circuits, IEEE Solid-State Circuits Society Newsletter, vol.11, issue.3, pp.33-35, 1965. ,
Optimizing MPI communication within large multicore nodes with kernel assistance, 24th IEEE International Symposium on Parallel and Distributed Processing, pp.1-7, 2010. ,
Parallel Multithreaded Machine. A Computing Environment for Distributed Architectures, Parallel Computing : State-of-the-Art and Perspectives, Proceedings of the conference ParCo, 1995. ,
Reducing complexity in tree-like computer interconnection networks, Parallel Computing, vol.36, pp.71-85, 2010. ,
, Methods to Check Process and Thread Affinity, 2019.
Pthreads Programming, 1996. ,
Topology aware Cartesian grid mapping with MPI. Poster at EuroMPI, 2018. ,
On the development of a communication-aware task mapping technique, Journal of Systems Architecture, vol.50, pp.207-220, 2004. ,
VMI 2.0 : A Dynamically Reconfigurable Messaging Layer for Availability, Usability and Management, 2000. ,
Revised Papers. Vol. 5798. Lecture Notes in Computer Science, Effects of Topology-Aware Allocation Policies on Scheduling Performance, Job Scheduling Strategies for Parallel Processing, 14th International Workshop, pp.138-156, 2009. ,
Graph Partitioning Software : An Overview, Combinatorial Scientific Computing, pp.373-406, 2012. ,
, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 16th European PVM/MPI Users' Group Meeting, pp.94-103, 2009.
A parallel SCRIP interpolation library for OASIS, 2018. ,
Enabling hierarchy-aware MPI collectives in dynamically changing topologies, Proceedings of the 24th European MPI Users' Group Meeting, vol.2, pp.1-2, 2017. ,
Revisiting locality-awareness in view of dynamically changing topologies, Parallel Computing, vol.77, pp.1-18, 2018. ,
, Portable Linux Processor Affinity (PLPA)
Message Passing Interface Library for Inhomogeneous Coupled Clusters, Proceedings of ACM/IEEE International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003. ,
A uGNI-Based MPICH2 Nemesis Network Module for the Cray XE, Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, 2011. ,
Job Placement and Network Routing on Fat-Tree Systems, The 47th International Conference on Parallel Processing, vol.36, 2018. ,
, Hierarchical Parallel Matrix Multiplication on Large-Scale Distributed Memory Platforms, 42nd International Conference on Parallel Processing, pp.754-762, 2013.
Interoperable high-performance MPI combining different vendor's MPI worlds, Tech. rep., Apr. 1998. ,
Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes, Proceedings of the 17th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, pp.427-436, 2009. ,
More Message Passing Performance with the Multithreaded MPICH Device, 1997. ,
, Distributed Resource Management for High Throughput Computing, HPDC'7, pp.28-31, 1998.
, Multi-core and Network Aware MPI Topology Functions, Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, pp.50-60, 2011.
Multicore aware process mapping and its impact on communication overhead of parallel applications, Proceedings of the IEEE Symposium on Computers and Communications, pp.811-817, 2009. ,
Issues in the Study of Graph Embeddings, WG '80 : Proceedings of the International Workshop on Graph-theoretic Concepts in Computer Science, 1981. ,
, Building Portable Thread Schedulers for Hierarchical Multiprocessors : the BubbleSched Framework, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00154506
Chap. Policy-Based Resource Assignment in Utility Computing Environments, pp.100-111, 2004. ,
Parallel Multilevel Algorithms for Multi-constraint Graph Partitioning (Distinguished Paper), Proceedings from the 6th International Euro-Par Conference on Parallel Processing, 2000. ,
Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany : Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, vol.4, 2017. ,
A Lightweight Low-Level Threading and Tasking Framework, IEEE Trans. Parallel Distrib. Syst, vol.29, pp.512-526, 2018. ,
, High Performance Computing for Computational Science - VECPAR 2010. Edited by José M, 2011.
How Good is Recursive Bisection?, SIAM, vol.18, pp.1436-1445, Sept. 1997 ,
, Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, pp.1005-1013, 2005.
Development of mixed mode MPI / OpenMP applications, Scientific Programming, vol.9, pp.83-98, 2001. ,
Communication-Aware Job Placement Policies for the KOALA Grid Scheduler, Proc. of the Second IEEE International Conference on e-Science and Grid Computing (e-Science'06), 2006. ,
A Novel Process Mapping Strategy in Clustered Environments, 2012. ,
Improving internode communications in multi-core clusters using a contention-free process mapping algorithm, The Journal of Supercomputing, vol.66, pp.488-513, 2013. ,
, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, pp.379-387, 2003.
Placement of communicating processes on multiprocessors networks, 1985. ,
Design of a Scalable Infiniband Topology Service to Enable Network-Topology-Aware Placement of Processes, Proceedings of the 2012 ACM/IEEE conference on Supercomputing (CDROM), p.12, 2012. ,
Optimization of the hop-byte metric for effective topology aware mapping, High Performance Computing (HiPC), 2012 19th International Conference on, pp.1-9, 2012. ,
Optimizing threaded MPI execution on SMP clusters, Proceedings of the 15th international conference on Supercomputing, pp.381-392, 2001. ,
Algorithm for Mapping Communicating Tasks on Heterogeneous Resources, 9th Heterogeneous Computing Workshop, pp.102-115, 2000. ,
Placement d'applications parallèles en fonction de l'affinité et de la topologie, 2015. ,
, CD-ROM, Jesper Larsson TRÄFF, Implementing the MPI process topology mechanism, vol.40, 2002.
, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, pp.392-400, 2002.
, Eighth International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS'03), pp.56-65, 2003.
Direct graph k-partitioning with a Kernighan-Lin like heuristic, Oper. Res. Lett, vol.34, issue.6, pp.621-629, 2006. ,
Collectives and Datatypes for Hierarchical All-to-all Communication, 21st European MPI Users' Group Meeting, EuroMPI/ASIA '14, p.27, 2014. ,
A multithreaded communication engine for multicore architectures, 22nd IEEE International Symposium on Parallel and Distributed Processing, pp.1-7, 2008. ,
Integrating New Capabilities into NetPIPE, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, pp.37-44, 2003. ,
, Poster presented at the 25th European MPI Users' Group Meeting, 2018.
Migrating from PVM to MPI, part I : the Unify System, 5th Symposium on the Frontiers of Massively Parallel Computation. Edited by IEEE Computer Society Technical Committee on Computer Architecture. 12. McLean, pp.488-495, 1995. ,
Optimized process placement for collective I/O operations, 20th European MPI Users' Group Meeting, EuroMPI '13, pp.31-36, 2013. ,
Fast Approximate Quadratic Programming for Graph Matching, PLoS One, vol.10, 2015. ,
, Locality Conscious Processor Allocation and Scheduling for Mixed Parallel Applications, 2006.
Minimising Communication Costs on a SMP Cluster using Process Placement, 2005. ,
Parallel Multilevel Graph-Partitioning Software - An Overview, Mesh Partitioning Techniques and Domain Decomposition Techniques. Edited by F. MAGOULES, 2007. ,
Hierarchical task mapping for parallel applications on supercomputers, The Journal of Supercomputing, vol.71, pp.1776-1802, 2015. ,
, YAMPII
Balancing job performance with system performance via locality-aware scheduling on torus-connected systems, Cluster'2014, pp.140-148, 2014. ,
Simple Linux Utility for Resource Management, Job Scheduling Strategies for Parallel Processing, pp.44-60, 2003. ,
Blue Gene System Software -Topology Mapping for Blue Gene/L Supercomputer, Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006. ,
New Process Placement Algorithm in Multi-core Clusters Aimed to Reducing Network Interface Contention, Advances in Computer Science, Engineering & Applications. Edited by David C. WYLD, Jan ZIZKA and Dhinaharan NAGAMALAI, 2012. ,
PHANTOM : predicting performance of parallel applications on large-scale parallel machines using a single node, Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010. ,
FACT : fast communication trace collection for parallel applications through program slicing, Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009. ,
Combining Static and Dynamic Analysis for Top-Down Communication Trace Compression, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.143-153, 2014. ,
The Netherlands, Euro-Par 2009 Parallel Processing, 15th International Euro-Par Conference, 2009. ,
Hierarchical Collectives in MPICH2, Recent Advances in Parallel Virtual Machine and Message Passing Interface, 16th European PVM/MPI Users' Group Meeting, pp.325-326, 2009. ,