E. Agullo, C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst et al., Faster, Cheaper, Better: a Hybridization Methodology to Develop Linear Algebra Software for GPUs, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00547847

M. Amini, B. Creusillet, S. Even, R. Keryell, O. Goubier et al., Par4All: From convex array regions to heterogeneous computing, 2nd International Workshop on Polyhedral Compilation Techniques (IMPACT), 2012.
URL : https://hal.archives-ouvertes.fr/hal-00744733

J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao et al., PetaBricks: A language and compiler for algorithmic choice, ACM SIGPLAN Conference on Programming Language Design and Implementation, 2009.

C. Augonnet, S. Thibault, R. Namyst, and P. Wacrenier, StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Proceedings of the 15th International Euro-Par Conference, pp.863-874, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00384363

C. Augonnet, S. Thibault, R. Namyst, and P. Wacrenier, StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00384363

E. Ayguadé, R. Badia, F. Igual, J. Labarta, R. Mayo et al., An Extension of the StarSs Programming Model for Platforms with Multiple GPUs, Euro-Par 2009 Parallel Processing, pp.851-862, 2009.

O. Beaumont, L. Marchal, and Y. Robert, Scheduling divisible loads with return messages on heterogeneous master-worker platforms, Laboratoire de l'Informatique du Parallélisme, 2005.

R. Bird, Lectures on Constructive Functional Programming, 1988.
DOI : 10.1007/978-3-642-74884-4_5

L. F. Bittencourt, R. Sakellariou, and E. R. Madeira, DAG Scheduling Using a Lookahead Variant of the Heterogeneous Earliest Finish Time Algorithm, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, pp.27-34, 2010.
DOI : 10.1109/PDP.2010.56

G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier et al., DAGuE: A generic distributed DAG engine for High Performance Computing, Parallel Computing, vol.38, issue.1-2, pp.37-51, 2011.
DOI : 10.1016/j.parco.2011.10.003

M. Boyer, K. Skadron, S. Che, and N. Jayasena, Load balancing in a changing world, Proceedings of the ACM International Conference on Computing Frontiers, CF '13, p.21, 2013.
DOI : 10.1145/2482767.2482794

K. J. Brown, A. K. Sujeeth, H. J. Lee, T. Rompf, H. Chafi et al., A heterogeneous parallel framework for domain-specific languages, Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on, pp.89-100, 2011.

M. M. Chakravarty, G. Keller, S. Lee, T. L. McDonell, and V. Grover, Accelerating Haskell array codes with multicore GPUs, Proceedings of the sixth workshop on Declarative aspects of multicore programming, DAMP '11, pp.3-14, 2011.
DOI : 10.1145/1926354.1926358

I. Christadler and V. Weinberg, RapidMind: Portability across Architectures and Its Limitations.
URL : http://arxiv.org/abs/1001.1902

M. Cosnard and M. Loi, Automatic task graph generation techniques, System Sciences II. Proceedings of the Twenty-Eighth Hawaii International Conference on, pp.113-122, 1995.
DOI : 10.1109/hicss.1995.375471

L. Courtès, C language extensions for hybrid CPU/GPU programming with StarPU, arXiv preprint.

R. Dolbeau, S. Bihan, and F. Bodin, HMPP: A hybrid multi-core parallel programming environment, 2007.

C. Elliott, Programming graphics processors functionally, Proceedings of the ACM SIGPLAN workshop on Haskell, Haskell '04, pp.45-56, 2004.
DOI : 10.1145/1017472.1017482
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.169.5989

T. Gautier, J. V. Lima, N. Maillard, and B. Raffin, XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
DOI : 10.1109/IPDPS.2013.66
URL : https://hal.archives-ouvertes.fr/hal-00799904

A. Ghuloum, E. Sprangle, J. Fang, G. Wu, and X. Zhou, Ct, Proceedings of the 4th ACM SIGPLAN workshop on Commercial users of functional programming, CUFP '07, 2007.
DOI : 10.1145/1362702.1362707

R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan, Optimization and Approximation in Deterministic Sequencing and Scheduling: a Survey, Annals of Discrete Mathematics, vol.5, pp.287-326, 1977.
DOI : 10.1016/S0167-5060(08)70356-X

D. Grewe, Z. Wang, and M. F. P. O'Boyle, Portable mapping of data parallel programs to OpenCL for heterogeneous systems, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2013.
DOI : 10.1109/CGO.2013.6494993

OpenACC Group, The OpenACC application programming interface, 2011.

K. Hammond, Parallel functional programming: An introduction, International Symposium on Parallel Symbolic Computation, p.9, 1994.

J. Launchbury and S. L. Peyton Jones, State in Haskell, Lisp and Symbolic Computation, pp.293-341, 1995.
DOI : 10.1007/BF01018827

R. Loogen, Y. Ortega-Mallén, and R. Peña-Marí, Parallel functional programming in Eden, Journal of Functional Programming, vol.15, issue.3, pp.431-476, 2005.
DOI : 10.1017/S0956796805005526

G. Mainland and G. Morrisett, Nikola: embedding compiled GPU functions in Haskell, Proceedings of the third ACM Haskell symposium on Haskell, pp.67-78, 2010.

S. Marlow, Parallel and Concurrent Programming in Haskell, CEFP 2011, pp.339-401
DOI : 10.1007/978-3-642-04652-0_6

M. D. McCool and S. Du Toit, Metaprogramming GPUs with Sh, 2004.

T. L. McDonell, M. M. Chakravarty, G. Keller, and B. Lippmeier, Optimising purely functional GPU programs, ICFP, 2013.

MulticoreWare Inc., GMAC: Global memory for accelerator, TM: Task manager, 2011.

C. Newburn, B. So, Z. Liu, M. McCool, A. Ghuloum et al., Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language, International Symposium on Code Generation and Optimization (CGO 2011), pp.224-235, 2011.
DOI : 10.1109/CGO.2011.5764690

S. L. Peyton Jones and D. R. Lester, Implementing functional languages: a tutorial, 1992.

J. Planas, R. M. Badia, E. Ayguadé, and J. Labarta, Hierarchical Task-Based Programming With StarSs, International Journal of High Performance Computing Applications, vol.23, issue.3, pp.284-299, 2009.
DOI : 10.1177/1094342009106195
URL : http://hdl.handle.net/2117/28379

R. Plasmeijer and M. van Eekelen, Functional programming and parallel graph rewriting, 1993.

S. Ranaweera and D. P. Agrawal, A task duplication based scheduling algorithm for heterogeneous systems, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000, pp.445-450, 2000.
DOI : 10.1109/IPDPS.2000.846020

M. C. Rinard and M. S. Lam, The design, implementation, and evaluation of Jade, ACM Transactions on Programming Languages and Systems, vol.20, issue.3, pp.483-545, 1998.
DOI : 10.1145/291889.291893

P. Roe and A. Wendelborn, Implicit array copying: Prevention is better than cure, 1992.

R. Sakellariou and H. Zhao, A hybrid heuristic for DAG scheduling on heterogeneous systems, Proceedings of the 18th International Parallel and Distributed Processing Symposium, p.111, 2004.
DOI : 10.1109/IPDPS.2004.1303065

E. Sun, D. Schaa, R. Bagley, N. Rubin, and D. Kaeli, Enabling task-level scheduling on heterogeneous platforms, Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, pp.84-93.

B. J. Svensson and R. Newton, Programming future parallel architectures with Haskell and Intel ArBB, 2011.

B. J. Svensson and M. Sheeran, Parallel programming in Haskell almost for free, Proceedings of the 1st ACM SIGPLAN workshop on Functional high-performance computing, FHPC '12, pp.3-14.
DOI : 10.1145/2364474.2364477

J. Svensson, M. Sheeran, and K. Claessen, Obsidian: A domain specific embedded language for parallel programming of graphics processors, Implementation and Application of Functional Languages, pp.156-173, 2011.

D. Tarditi, S. Puri, and J. Oglesby, Accelerator: using data parallelism to program GPUs for general-purpose uses, ASPLOS-XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, pp.325-335, 2006.

H. Topcuoglu, S. Hariri, and M. Wu, Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, issue.3, pp.260-274, 2002.
DOI : 10.1109/71.993206
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.119.122

J. Windows, Automated parallelisation of code written in the Bird-Meertens formalism, 2003.