Performance Tuning of x86 OpenMP Codes with MAQAO, Tools for High Performance Computing, p.95-113, 2009.
DOI : 10.1007/978-3-642-11261-4_7
hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010.
DOI : 10.1109/PDP.2010.67
URL : https://hal.archives-ouvertes.fr/inria-00429889
A scalable cross-platform infrastructure for application performance tuning using hardware counters, Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, SC '00, 2000.
The Scalasca performance toolset architecture, Proc. of the International Workshop on Scalable Tools for High-End Computing (STHEC), p.51-65, 2008.
DOI : 10.1002/cpe.1556
Dissecting On-Node Memory Access Performance: A Semantic Approach, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis, p.166-176, 2014.
DOI : 10.1109/SC.2014.19
Locality-Aware Parallel Process Mapping for Multi-core HPC Systems, 2011 IEEE International Conference on Cluster Computing, p.527-531, 2011.
DOI : 10.1109/CLUSTER.2011.59
Process placement in multicore clusters: Algorithmic issues and practical techniques, IEEE Transactions on Parallel and Distributed Systems, vol.25, issue.4, p.993-1002, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00803548
The Vampir Performance Analysis Tool-Set, Proceedings of the 2nd International Workshop on Parallel Tools for High Performance Computing, p.139-155, 2008.
DOI : 10.1007/978-3-540-68564-7_9
Pin: Building customized program analysis tools with dynamic instrumentation, Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '05, p.190-200, 2005.
Memory management in NUMA multicore systems: Trapped between cache contention and interconnect overhead, SIGPLAN Not., vol.46, issue.11, p.11-20, 2011.
PARAVER: A Tool to Visualize and Analyze Parallel Code, Proceedings of WoTUG-18: Transputer and occam Developments, p.17-31, 1995.
Tiptop: Hardware Performance Counters for the Masses, 2012 41st International Conference on Parallel Processing Workshops, p.77-89, 2011.
DOI : 10.1109/ICPPW.2012.58
URL : https://hal.archives-ouvertes.fr/hal-00639173
LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments, 2010 39th International Conference on Parallel Processing Workshops, p.207-216, 2010.
DOI : 10.1109/ICPPW.2010.38
Addressing shared resource contention in multicore processors via scheduling, SIGPLAN Not., vol.45, issue.3, p.129-142, 2010.