E. Ayguadé, B. Blainey, A. Duran, J. Labarta, F. Martínez et al., Is the Schedule Clause Really Necessary in OpenMP?, Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming, WOMPAT'03, pp.147-159, 2003.
DOI : 10.1007/3-540-45009-2_12

F. Broquedis, O. Aumage, B. Goglin, S. Thibault, P. Wacrenier et al., Structuring the execution of OpenMP applications for multicore architectures, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010.
DOI : 10.1109/IPDPS.2010.5470442
URL : https://hal.archives-ouvertes.fr/inria-00441472

F. Broquedis, T. Gautier, and V. Danjean, libKOMP, an Efficient OpenMP Runtime System for Both Fork-Join and Data Flow Paradigms, Proceedings of the 8th international conference on OpenMP in a Heterogeneous World, 2012.
DOI : 10.1007/978-3-642-30961-8_8
URL : https://hal.archives-ouvertes.fr/hal-00796253

F. Broquedis, N. Furmento, B. Goglin, R. Namyst, and P. Wacrenier, Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective, International Workshop on OpenMP (IWOMP), 2009.
DOI : 10.1007/978-3-540-74466-5_6
URL : https://hal.archives-ouvertes.fr/inria-00367570

J. M. Bull, Measuring synchronisation and scheduling overheads in OpenMP, Proceedings of the First European Workshop on OpenMP, pp.99-105, 1999.

S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer et al., Rodinia: A benchmark suite for heterogeneous computing, 2009 IEEE International Symposium on Workload Characterization (IISWC), pp.44-54, 2009.
DOI : 10.1109/IISWC.2009.5306797

M. Durand, B. Raffin, and F. Faure, A Packed Memory Array to Keep Moving Particles Sorted, 9th Workshop on Virtual Reality Interaction and Physical Simulation, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00762593

M. Frigo, C. E. Leiserson, and K. H. Randall, The implementation of the Cilk-5 multithreaded language, ACM SIGPLAN Notices, vol.33, issue.5, pp.212-223, 1998.
DOI : 10.1145/277652.277725

R. C. Hoetzlein, Fluids v2.0, open source fluid simulator, 2008.

L. Huang, H. Jin, L. Yi, and B. Chapman, Enabling Locality-Aware Computations in OpenMP, Scientific Programming, vol.18, issue.3-4, p.169, 2010.
DOI : 10.1155/2010/185421

M. Ihmsen, N. Akinci, M. Becker, and M. Teschner, A Parallel SPH Implementation on Multi-Core CPUs, Computer Graphics Forum, vol.30, issue.1, pp.99-112, 2011.
DOI : 10.1111/j.1467-8659.2010.01832.x

A. Mahéo, S. Koliaï, P. Carribault, M. Pérache, and W. Jalby, Adaptive OpenMP for Large NUMA Nodes, Proceedings of the 8th international conference on OpenMP in a Heterogeneous World, pp.254-257, 2012.
DOI : 10.1007/978-3-642-30961-8_20

A. Marowka, Z. Liu, and B. Chapman, OpenMP-oriented applications for distributed shared memory architectures, Concurrency and Computation: Practice and Experience, vol.16, issue.4, 2004.
DOI : 10.1002/cpe.752

J. D. McCalpin, Memory bandwidth and machine balance in current high performance computers, IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, pp.19-25, 1995.

S. L. Olivier, B. R. de Supinski, M. Schulz, and J. F. Prins, Characterizing and mitigating work time inflation in task parallel programs, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '12, pp.1-65, 2012.

S. L. Olivier, A. K. Porterfield, K. B. Wheeler, M. Spiegel, and J. F. Prins, OpenMP task scheduling strategies for multicore NUMA systems, International Journal of High Performance Computing Applications, vol.26, issue.2, pp.110-124, 2012.
DOI : 10.1177/1094342011434065

S. Subramaniam and D. L. Eager, Affinity scheduling of unbalanced workloads, Proceedings of the 1994 ACM/IEEE conference on Supercomputing, Supercomputing '94, pp.214-226, 1994.

M. Tchiboukdjian, V. Danjean, T. Gautier, F. Le Mentec, and B. Raffin, A Work Stealing Scheduler for Parallel Loops on Shared Cache Multicores, Proceedings of the 2010 conference on Parallel processing, pp.99-107, 2010.

D. Traoré, J. Roch, N. Maillard, T. Gautier, and J. Bernard, Deque-Free Work-Optimal Parallel STL Algorithms, Proceedings of the 14th international Euro-Par conference on Parallel Processing, Euro-Par '08, pp.887-897, 2008.
DOI : 10.1007/978-3-540-85451-7_95

Y. Yan, C. Jin, and X. Zhang, Adaptively scheduling parallel loops in distributed shared-memory systems, IEEE Transactions on Parallel and Distributed Systems, vol.8, issue.1, 1997.