H. Topcuoglu, S. Hariri, and M. Y. Wu, Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, issue.3, pp.260-274, 2002.

C. Augonnet, S. Thibault, R. Namyst, and P. A. Wacrenier, StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00384363

MulticoreWare Inc., GMAC: Global Memory for Accelerator, TM: Task Manager, 2011.

J. Kim, H. Kim, J. H. Lee, and J. Lee, Achieving a single compute device image in OpenCL for multiple GPUs, Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP '11, pp.277-288, 2011.

C. de la Lama, P. Toharia, J. Bosque, and O. Robles, Static Multi-device Load Balancing for OpenCL, 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications, pp.675-682, 2012.
DOI : 10.1109/ISPA.2012.100

K. Spafford, J. Meredith, and J. Vetter, Maestro: Data Orchestration and Tuning for OpenCL Devices, Proceedings of the 16th International Euro-Par Conference on Parallel Processing: Part II, Euro-Par '10, pp.275-286, 2010.
DOI : 10.1007/978-3-642-15291-7_26

J. Kim, S. Seo, J. Lee, J. Nah, G. Jo et al., SnuCL, Proceedings of the 26th ACM international conference on Supercomputing, ICS '12, pp.341-352, 2012.
DOI : 10.1145/2304576.2304623

D. Grewe and M. F. O'Boyle, A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL, CC '11: Proceedings of the 20th International Conference on Compiler Construction, 2011.
DOI : 10.1007/978-3-540-92990-1_4

R. Dolbeau, S. Bihan, and F. Bodin, HMPP: A hybrid Multi-core Parallel Programming Environment, 2007.

M. Wolfe, Implementing the PGI Accelerator model, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, 2010.
DOI : 10.1145/1735688.1735697

D. Grewe, Z. Wang, and M. F. O'Boyle, Portable mapping of data parallel programs to OpenCL for heterogeneous systems, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2013.
DOI : 10.1109/CGO.2013.6494993

C. K. Luk, S. Hong, and H. Kim, Qilin, Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, Micro-42, pp.45-55, 2009.
DOI : 10.1145/1669112.1669121

E. Ayguadé, R. M. Badia, F. D. Igual, J. Labarta, R. Mayo et al., An Extension of the StarSs Programming Model for Platforms with Multiple GPUs, Proceedings of the 15th International Euro-Par Conference on Parallel Processing, Euro-Par '09, pp.851-862, 2009.
DOI : 10.1109/TPDS.2003.1214317

T. Gautier, X. Besseron, and L. Pigeon, KAAPI, Proceedings of the 2007 international workshop on Parallel symbolic computation, PASCO '07, pp.15-23, 2007.
DOI : 10.1145/1278177.1278182
URL : https://hal.archives-ouvertes.fr/hal-00647474

M. Boyer, K. Skadron, S. Che, and N. Jayasena, Load balancing in a changing world, Proceedings of the ACM International Conference on Computing Frontiers, CF '13, 2013.
DOI : 10.1145/2482767.2482794