U. A. Acar and G. Blelloch, 15210: Algorithms: Parallel and sequential, accessed 2015.

U. A. Acar and G. Blelloch, Algorithm design: Parallel and sequential, http://www.parallel-algorithms-book.com, accessed 2015.

U. A. Acar, G. E. Blelloch, and R. D. Blumofe, The data locality of work stealing, Theory of Computing Systems, pp.321-347, 2002.

U. A. Acar, A. Charguéraud, and M. Rainey, Oracle scheduling: Controlling granularity in implicitly parallel languages, ACM SIGPLAN conference on object-oriented programming, systems, languages, and applications (OOPSLA), pp.499-518, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01409069

U. A. Acar, A. Charguéraud, and M. Rainey, Scheduling parallel programs by work stealing with private deques, 2013.
DOI : 10.1145/2517327.2442538

URL : https://hal.archives-ouvertes.fr/hal-00863028

U. A. Acar, A. Charguéraud, and M. Rainey, An introduction to parallel computing in C++, 2015.

U. A. Acar, A. Charguéraud, and M. Rainey, A work-efficient algorithm for parallel unordered depth-first search, ACM/IEEE conference on high performance computing (SC), p.1, 2015.
DOI : 10.1145/2807591.2807651

URL : https://hal.archives-ouvertes.fr/hal-01245837

G. Aharoni, D. G. Feitelson, and A. Barak, A run-time algorithm for managing the granularity of parallel functional programs, Journal of Functional Programming, vol.2, issue.4, pp.387-405, 1992.
DOI : 10.1145/800055.802033

N. S. Arora, R. D. Blumofe, and C. G. Plaxton, Thread scheduling for multiprogrammed multiprocessors, Proceedings of the tenth annual ACM symposium on parallel algorithms and architectures, SPAA '98, pp.119-129, 1998.
DOI : 10.1007/s00224-001-0004-z

N. S. Arora, R. D. Blumofe, and C. G. Plaxton, Thread scheduling for multiprogrammed multiprocessors, Theory of Computing Systems, pp.115-144, 2001.
DOI : 10.1007/s00224-001-0004-z

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.130.3853

J. Barnes and P. Hut, A hierarchical O(N log N) force-calculation algorithm, Nature, vol.324, issue.6096, pp.446-449, 1986.
DOI : 10.1038/324446a0

L. Bergstrom, M. Fluet, M. Rainey et al., Lazy tree splitting, 2010.
DOI : 10.1145/1932681.1863558

G. Blelloch and J. Greiner, Parallelism in sequential functional languages, Proceedings of the seventh international conference on Functional programming languages and computer architecture, FPCA '95, 1995.
DOI : 10.1145/224164.224210

G. E. Blelloch and P. B. Gibbons, Effectively sharing a cache among threads, Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures, SPAA '04, 2004.
DOI : 10.1145/1007912.1007948

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.103.2165

G. E. Blelloch and J. Greiner, A provable time and space efficient implementation of NESL, Proceedings of the 1st ACM SIGPLAN international conference on functional programming, pp.213-225, 1996.

G. E. Blelloch and G. W. Sabot, Compiling collection-oriented languages onto massively parallel computers, Journal of Parallel and Distributed Computing, vol.8, issue.2, pp.119-134, 1990.
DOI : 10.1016/0743-7315(90)90087-6

G. E. Blelloch, J. C. Hardwick, J. Sipelstein, M. Zagha et al., Implementation of a Portable Nested Data-Parallel Language, Journal of Parallel and Distributed Computing, vol.21, issue.1, pp.4-14, 1994.
DOI : 10.1006/jpdc.1994.1038

G. E. Blelloch, J. T. Fineman, P. B. Gibbons, and H. V. Simhadri, Scheduling irregular parallel computations on hierarchical caches, Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures, SPAA '11, 2011.
DOI : 10.1145/1989493.1989553

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.206.2648

R. D. Blumofe and C. E. Leiserson, Scheduling multithreaded computations by work stealing, Journal of the ACM, vol.46, issue.5, pp.720-748, 1999.
DOI : 10.1109/sfcs.1994.365680

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.113.7695

R. P. Brent, The Parallel Evaluation of General Arithmetic Expressions, Journal of the ACM, vol.21, issue.2, pp.201-206, 1974.
DOI : 10.1145/321812.321815

M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones et al., Data parallel Haskell, Proceedings of the 2007 workshop on Declarative aspects of multicore architectures, DAMP '07, p.7, 2007.
DOI : 10.1145/1248648.1248652

R. A. Chowdhury, F. Silvestri, B. Blakeley, and V. Ramachandran, Oblivious algorithms for multicores and network of processors, 2010.

R. Cole and V. Ramachandran, Resource oblivious sorting on multicores, Proceedings of the 37th international colloquium conference on automata, languages and programming, ICALP '10, pp.226-237, 2010.
DOI : 10.1007/978-3-642-14165-2_20

URL : http://arxiv.org/abs/1508.01504

K. Crary and S. Weirich, Resource bound certification, Proceedings of the 27th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '00, 2000.
DOI : 10.1145/325694.325716

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.5985

M. Feeley, A message passing implementation of lazy task creation, Parallel symbolic computing, pp.94-107, 1992.

M. Feeley, An efficient and general implementation of futures on large scale shared-memory multiprocessors, Ph.D. thesis, Brandeis University, UMI Order No. GAX93-22348, 1993.

M. Fluet, M. Rainey, and J. Reppy, A scheduling framework for general-purpose parallel languages, ACM SIGPLAN international conference on functional programming (ICFP), pp.241-252, 2008.
DOI : 10.1145/1411203.1411239

M. Fluet, M. Rainey, J. Reppy et al., Implicitly threaded parallelism in Manticore, Journal of Functional Programming, vol.20, issue.5-6, 2011.

J. D. Frens and D. S. Wise, Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code, Proceedings of the sixth ACM SIGPLAN symposium on principles and practice of parallel programming, PPOPP '97, pp.206-216, 1997.
DOI : 10.1145/263764.263789

M. Frigo, C. E. Leiserson, and K. H. Randall, The implementation of the Cilk-5 multithreaded language, 1998.

S. F. Goldsmith, A. S. Aiken, and D. S. Wilkerson, Measuring empirical computational complexity, Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering, ESEC-FSE '07, 2007.
DOI : 10.1145/1287624.1287681

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.2610

S. Gulwani, K. K. Mehra, and T. Chilimbi, SPEED: precise and efficient static estimation of program computational complexity, Proceedings of the 36th annual ACM SIGPLAN-SIGACT symposium on principles of programming languages, pp.127-139, 2009.

R. H. Halstead, MULTILISP: a language for concurrent symbolic computation, ACM Transactions on Programming Languages and Systems, vol.7, issue.4, pp.501-538, 1985.
DOI : 10.1145/4472.4478

T. Hiraishi, M. Yasugi, S. Umatani et al., Backtracking-based load balancing, PPoPP '09, pp.55-64, 2009.
DOI : 10.1145/1504176.1504187

L. Huelsbergen, J. R. Larus, and A. Aiken, Using the run-time sizes of data structures to guide parallel-thread creation, Proceedings of the 1994 ACM conference on Lisp and functional programming, LFP '94, pp.79-90, 1994.

S. Jost, K. Hammond, H.-W. Loidl et al., Static determination of quantitative resource usage for higher-order programs, 2010.

X. Leroy, D. Doligez, J. Garrigue et al., The Objective Caml system, 2005.

P. Lopez, M. Hermenegildo, and S. Debray, A Methodology for Granularity-Based Control of Parallelism in Logic Programs, Journal of Symbolic Computation, vol.21, issue.4-6, pp.715-734, 1996.
DOI : 10.1006/jsco.1996.0038

E. Mohr, D. A. Kranz, and R. H. Halstead Jr., Lazy task creation: a technique for increasing the granularity of parallel programs, Conference record of the 1990 ACM conference on Lisp and functional programming, pp.185-197, 1990.

G. J. Narlikar, Space-efficient scheduling for parallel, multithreaded computations, 1999.

J. Pehoushek and J. Weening, Low-cost process creation and dynamic partitioning in Qlisp, Parallel Lisp: Languages and systems, Lecture Notes in Computer Science, vol.441, pp.182-199, 1990.
DOI : 10.1007/bfb0024155

S. L. Peyton Jones, Harnessing the multicores: Nested data parallelism in Haskell, 2008.

S. L. Peyton Jones, R. Leshchinskiy, G. Keller et al., Harnessing the multicores: Nested data parallelism in Haskell, 2008.

H. C. Plummer, On the problem of distribution in globular star clusters, Monthly Notices of the Royal Astronomical Society, vol.71, pp.460-470, 1911.

M. Rainey, Effective scheduling techniques for high-level parallel programming languages, 2010.

M. Rosendahl, Automatic complexity analysis, FPCA '89: Functional programming languages and computer architecture, pp.144-156, 1989.
DOI : 10.1145/99370.99381

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.465.4828

D. Sanchez, R. M. Yoo, and C. Kozyrakis, Flexible architectural support for fine-grain scheduling, Proceedings of the fifteenth edition of ASPLOS on architectural support for programming languages and operating systems, ASPLOS '10, pp.311-322, 2010.
DOI : 10.1145/1735970.1736055

D. Sands, Calculi for time analysis of functional programs, 1990.

K. C. Sivaramakrishnan, L. Ziarek, and S. Jagannathan, MultiMLton: A multicore-aware runtime for Standard ML, Journal of Functional Programming, vol.24, issue.6, pp.1-62, 2014.
DOI : 10.1145/324133.324234

D. Spoonhower, Scheduling deterministic parallel programs, 2009.

D. Spoonhower, G. E. Blelloch, R. Harper, and P. B. Gibbons, Space profiling for parallel functional programs, International conference on functional programming, 2008.
DOI : 10.1145/1411204.1411240

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.141.5516

A. Tzannes, G. C. Caragea, U. Vishkin, and R. Barua, Lazy Scheduling, ACM Transactions on Programming Languages and Systems, vol.36, issue.3, pp.10:1-10:51, 2014.
DOI : 10.1145/2629643

L. G. Valiant, A bridging model for parallel computation, Communications of the ACM, vol.33, issue.8, pp.103-111, 1990.
DOI : 10.1145/79173.79181

J. S. Weening, Parallel execution of lisp programs, 1989.