F. Broquedis, N. Furmento, B. Goglin, P. Wacrenier, and R. Namyst, ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures, International Journal of Parallel Programming, vol.38, issue.5-6, pp.418-439, 2010.
DOI : 10.1007/s10766-010-0136-3

URL : https://hal.archives-ouvertes.fr/inria-00496295

F. Broquedis, T. Gautier, and V. Danjean, libKOMP, an Efficient OpenMP Runtime System for Both Fork-Join and Data Flow Paradigms, Proceedings of the 8th International Conference on OpenMP in a Heterogeneous World, pp.102-115, 2012.
DOI : 10.1007/978-3-642-30961-8_8

URL : https://hal.archives-ouvertes.fr/hal-00796253

F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin et al., hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, pp.180-186, 2010.
DOI : 10.1109/PDP.2010.67

URL : https://hal.archives-ouvertes.fr/inria-00429889

J. Clet-Ortega, P. Carribault, and M. Pérache, Evaluation of OpenMP Task Scheduling Algorithms for Large NUMA Architectures, Euro-Par 2014 Parallel Processing - 20th International Conference Proceedings, pp.596-607, 2014.
DOI : 10.1007/978-3-319-09873-9_50

A. Drebes, K. Heydemann, N. Drach, A. Pop, and A. Cohen, Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages, ACM Transactions on Architecture and Code Optimization, vol.11, issue.3, pp.30:1-30:25, 2014.
DOI : 10.1145/2641764

URL : https://hal.archives-ouvertes.fr/hal-01136491

M. Frigo, C. E. Leiserson, and K. H. Randall, The implementation of the Cilk-5 multithreaded language, ACM SIGPLAN Notices, vol.33, issue.5, pp.212-223, 1998.
DOI : 10.1145/277652.277725

T. Gautier, X. Besseron, and L. Pigeon, KAAPI: A thread scheduling runtime system for data flow computations on cluster of multi-processors, Proceedings of the 2007 international workshop on Parallel symbolic computation, PASCO '07, 2007.
DOI : 10.1145/1278177.1278182

URL : https://hal.archives-ouvertes.fr/hal-00647474

S. Olivier, A. Porterfield, K. B. Wheeler, M. Spiegel, and J. F. Prins, OpenMP task scheduling strategies for multicore NUMA systems, International Journal of High Performance Computing Applications, vol.26, issue.2, pp.110-124, 2012.
DOI : 10.1177/1094342011434065

S. L. Olivier, B. R. de Supinski, M. Schulz, and J. F. Prins, Characterizing and mitigating work time inflation in task parallel programs, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, article 65, 2012.

O. Tahan, Towards efficient OpenMP strategies for non-uniform architectures, CoRR, abs/1411.

C. Terboven, D. Schmidl, T. Cramer, and D. an Mey, Task-Parallel Programming on NUMA Architectures, Euro-Par 2012 Parallel Processing - 18th International Conference Proceedings, volume 7484 of Lecture Notes in Computer Science, pp.638-649, 2012.
DOI : 10.1007/978-3-642-32820-6_63

P. Virouleau, P. Brunet, F. Broquedis, N. Furmento, S. Thibault et al., Evaluation of OpenMP Dependent Tasks with the KASTORS Benchmark Suite, 10th International Workshop on OpenMP, IWOMP2014, pp.16-29, 2014.
DOI : 10.1007/978-3-319-11454-5_2

URL : https://hal.archives-ouvertes.fr/hal-01081974

T. Weng and B. M. Chapman, Implementing OpenMP using dataflow execution model for data locality and efficient parallel execution, Proceedings of the 16th International Parallel and Distributed Processing Symposium, IPDPS '02, p.180, 2002.

M. Wittmann and G. Hager, Optimizing ccNUMA locality for task-parallel execution under OpenMP and TBB on multicore-based systems, 2011.