R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall et al., Cilk: an efficient multithreaded runtime system, Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming, ser. PPOPP '95, pp.207-216, 1995.
DOI : 10.1006/jpdc.1996.0107

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3175

L. Kale and S. Krishnan, CHARM++: A Portable Concurrent Object Oriented System Based on C++, Proceedings of Object-Oriented Programming, Systems, Languages and Applications (OOPSLA) 93, pp.91-108, 1993.

L. V. Kale and G. Zheng, Charm++ and AMPI: Adaptive Runtime Strategies via Migratable Objects, Advanced Computational Infrastructures for Parallel and Distributed Applications, pp.265-282, 2009.
DOI : 10.1002/9780470558027.ch13

B. Brandfass, T. Alrutz, and T. Gerhold, Rank reordering for MPI communication optimization, Computers & Fluids, vol.80, 2012.
DOI : 10.1016/j.compfluid.2012.01.019

H. Chen, W. Chen, J. Huang, B. Robert, and H. Kuhn, MPIPP: an automatic profile-guided parallel process placement toolset for SMP clusters and multiclusters, Proceedings of the 20th annual international conference on Supercomputing, ICS '06, pp.353-360, 2006.
DOI : 10.1145/1183401.1183451

E. Jeannot and G. Mercier, Near-Optimal Placement of MPI Processes on Hierarchical NUMA Architectures, Euro-Par 2010-Parallel Processing, pp.199-210, 2010.
DOI : 10.1007/978-3-642-15291-7_20

URL : https://hal.archives-ouvertes.fr/inria-00544346

G. Mercier and E. Jeannot, Improving MPI Applications Performance on Multicore Clusters with Rank Reordering, EuroMPI, ser. Lecture Notes in Computer Science, pp.39-49, 2011.
DOI : 10.1007/978-3-642-24449-0_7

URL : https://hal.archives-ouvertes.fr/hal-00643151

F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin et al., hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010.
DOI : 10.1109/PDP.2010.67

URL : https://hal.archives-ouvertes.fr/inria-00429889

T. Hoefler and M. Snir, Generic topology mapping strategies for large-scale parallel architectures, Proceedings of the international conference on Supercomputing, ICS '11, pp.75-84, 2011.
DOI : 10.1145/1995896.1995909

T. Hoefler, R. Rabenseifner, H. Ritzdorf, B. R. de Supinski, R. Thakur et al., The scalable process topology interface of MPI 2.2, Concurrency and Computation: Practice and Experience, pp.293-310, 2010.
DOI : 10.1002/cpe.1643

J. Dümmler, T. Rauber, and G. Rünger, Mapping Algorithms for Multiprocessor Tasks on Multi-Core Clusters, 2008 37th International Conference on Parallel Processing, pp.141-148, 2008.
DOI : 10.1109/ICPP.2008.42

L. V. Kale and S. Krishnan, Charm++: Parallel Programming with Message-Driven Objects, pp.175-213, 1996.

R. D. Skeel and K. Schulten, NAMD: a parallel, object-oriented molecular dynamics program, Intl. J. Supercomput. Applics. High Performance Computing, vol.10, issue.4, pp.251-268, 1996.

V. Mehta, LeanMD: A Charm++ framework for high performance molecular dynamics simulation on large parallel machines, 2004.

P. Jetley, F. Gioachin, C. Mendes, L. V. Kale, and T. R. Quinn, Massively parallel cosmological simulations with ChaNGa, 2008 IEEE International Symposium on Parallel and Distributed Processing, 2008.
DOI : 10.1109/IPDPS.2008.4536319

A. Bhatelé and L. V. Kalé, Benefits of Topology Aware Mapping for Mesh Interconnects, Parallel Processing Letters (Special issue on Large-Scale Parallel Processing), pp.549-566, 2008.
DOI : 10.1142/S0129626408003569

L. L. Pilla, C. P. Ribeiro, D. Cordeiro, C. Mei, A. Bhatele et al., A Hierarchical Approach for Load Balancing on Parallel Multi-core Systems, 2012 41st International Conference on Parallel Processing, pp.118-127, 2012.
DOI : 10.1109/ICPP.2012.9

URL : https://hal.archives-ouvertes.fr/hal-00788012

L. L. Pilla, P. O. Navaux, C. P. Ribeiro, P. Coucheney, F. Broquedis et al., Asymptotically Optimal Load Balancing for Hierarchical Multi-Core Systems, 2012 IEEE 18th International Conference on Parallel and Distributed Systems, pp.236-243, 2012.
DOI : 10.1109/ICPADS.2012.41

URL : https://hal.archives-ouvertes.fr/hal-00788008

F. Pellegrini, Static mapping by dual recursive bipartitioning of process architecture graphs, Proceedings of IEEE Scalable High Performance Computing Conference, pp.486-493, 1994.
DOI : 10.1109/SHPCC.1994.296682

G. Karypis and V. Kumar, METIS: Unstructured Graph Partitioning and Sparse Matrix Ordering System, Version 2.0, Tech. Rep., 1995.

F. Pellegrini, SCOTCH and LIBSCOTCH 5.1 User's Guide, ScAlApplix project, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00410332

S. Micali and V. V. Vazirani, An O(√|V|·|E|) algorithm for finding maximum matching in general graphs, Proc. 21st Ann. IEEE Symp. on Foundations of Computer Science, pp.17-27, 1980.