Coupling Memory and Computation for Locality Management, Summit on Advances in Programming Languages (SNAPL), 2015. ,
The data locality of work stealing, Theory of Computing Systems (TOCS), vol.35, pp.321-347, 2002. ,
Scheduling Parallel Programs by Work Stealing with Private Deques, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '13), 2013. ,
Oracle-guided scheduling for controlling granularity in implicitly parallel languages, Journal of Functional Programming, vol.26, p.23, 2016. ,
Deadlock-free scheduling of X10 computations with bounded resources, SPAA 2007: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp.229-240, 2007. ,
Thread scheduling for multiprogrammed multiprocessors, Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures (SPAA '98), pp.119-129, 1998. ,
Thread Scheduling for Multiprogrammed Multiprocessors, Theory of Computing Systems, vol.34, pp.115-144, 2001. ,
Internally deterministic parallel algorithms can be fast, PPoPP '12, pp.181-192, 2012. ,
Scheduling irregular parallel computations on hierarchical caches, Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '11), pp.355-366, 2011. ,
Effectively sharing a cache among threads, SPAA, 2004. ,
Provably efficient scheduling for languages with fine-grained parallelism, J. ACM, vol.46, pp.281-321, 1999. ,
Space-Efficient Scheduling of Multithreaded Computations, SIAM J. Comput, vol.27, pp.202-229, 1998. ,
Scheduling multithreaded computations by work stealing, J. ACM, vol.46, pp.720-748, 1999. ,
The parallel evaluation of general arithmetic expressions, J. ACM, vol.21, pp.201-206, 1974. ,
Executing functional programs on a virtual tree of processors, Functional Programming Languages and Computer Architecture (FPCA '81), pp.187-194, 1981. ,
X10: an object-oriented approach to non-uniform cluster computing, Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (OOPSLA '05), pp.519-538, 2005. ,
Dynamic circular work-stealing deque, SPAA '05, pp.21-28, 2005. ,
Cache-efficient dynamic programming algorithms for multicores, Proc. 20th ACM Symposium on Parallelism in Algorithms and Architectures, pp.207-216, 2008. ,
An adaptive cut-off for task parallelism, 2008 SC-International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-11, 2008. ,
Speedup versus efficiency in parallel systems, IEEE Transactions on Computers, vol.38, pp.408-423, 1989. ,
A Message Passing Implementation of Lazy Task Creation, Parallel Symbolic Computing, pp.94-107, 1992. ,
Polling efficiently on stock hardware, Proceedings of the conference on Functional programming languages and computer architecture (FPCA '93), pp.179-187, 1993. ,
Control Operators, the SECD-Machine, and the Lambda-Calculus, Formal Description of Programming Concepts-III, pp.193-219, 1987. ,
Implicitly threaded parallelism in Manticore, Journal of Functional Programming, vol.20, pp.1-40, 2011. ,
Implicitly-threaded parallelism in Manticore, ICFP, pp.119-130, 2008. ,
The Implementation of the Cilk-5 Multithreaded Language, PLDI, pp.212-223, 1998. ,
Enabling Primitives for Compiling Parallel Languages, Third Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers, 1995. ,
Lazy threads: Implementing a fast parallel call, J. Parallel and Distrib. Comput, vol.37, pp.5-20, 1996. ,
A Provably Time-efficient Parallel Implementation of Full Speculation, ACM Transactions on Programming Languages and Systems, vol.21, issue.2, pp.240-285, 1999. ,
Hierarchical Memory Management for Mutable State, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP), 2018. ,
Implementation of Multilisp: Lisp on a Multiprocessor, Proceedings of the 1984 ACM Symposium on LISP and functional programming (LFP '84), pp.9-17, 1984. ,
Burroughs' B6500/B7500 Stack Mechanism, Spring Joint Computer Conference (AFIPS '68 (Spring)), pp.245-251, 1968. ,
Backtracking-based load balancing, Proceedings of the 2009 ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, vol.44, pp.55-64, 2009. ,
Using the run-time sizes of data structures to guide parallel-thread creation, Proceedings of the 1994 ACM conference on LISP and functional programming (LFP '94), pp.79-90, 1994. ,
Habanero-Java library: a Java 8 framework for multicore programming, 2014 International Conference on Principles and Practices of Programming on the Java Platform Virtual Machines, Languages and Tools, PPPJ '14, pp.75-86, 2014. ,
Intel Threading Building Blocks, 2011. ,
A static cut-off for task parallel programs, Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, pp.139-150, 2016. ,
A Java fork/join framework, Proceedings of the ACM 2000 conference on Java Grande (JAVA '00), pp.36-43, 2000. ,
On-the-Fly Pipeline Parallelism, TOPC, vol.2, pp.1-17, 2015. ,
Using Memory Mapping to Support Cactus Stacks in Work-stealing Runtime Systems, Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT '10), pp.411-420, 2010. ,
The design of a task parallel library, Proceedings of the 24th ACM SIGPLAN conference on Object Oriented Programming Systems Languages and Applications (OOPSLA '09), pp.227-242, 2009. ,
A methodology for granularity-based control of parallelism in logic programs, Journal of Symbolic Computation, vol.21, pp.715-734, 1996. ,
Parallel and Concurrent Programming in Haskell, 2013. ,
Lazy task creation: a technique for increasing the granularity of parallel programs, IEEE Transactions on Parallel and Distributed Systems, vol.2, pp.264-280, 1991. ,
Space-Efficient Scheduling of Nested Parallelism, ACM Transactions on Programming Languages and Systems, vol.21, 1999. ,
OpenMP Application Program Interface, OpenMP Architecture Review Board. ,
Low-cost process creation and dynamic partitioning in Qlisp, Parallel Lisp: Languages and Systems, vol.441, pp.182-199, 1990. ,
Hierarchical Memory Management for Parallel Programs, ICFP 2016, 2016. URL: https://hal.archives-ouvertes.fr/hal-01416237
Flexible architectural support for fine-grain scheduling, Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems (ASPLOS '10), pp.311-322, 2010. ,
Brief Announcement: The Problem Based Benchmark Suite, Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '12), pp.68-70, 2012. ,
MultiMLton: A multicore-aware runtime for standard ML, Journal of Functional Programming FirstView, pp.1-62, 2014. ,
Beyond Nested Parallelism: Tight Bounds on Work-stealing Overheads for Parallel Futures, Proceedings of the Twenty-first Annual Symposium on Parallelism in Algorithms and Architectures (SPAA '09), pp.91-100, 2009. ,
Lazy binary-splitting: a run-time adaptive work-stealing scheduler, Symposium on Principles & Practice of Parallel Programming, pp.179-190, 2010. ,
Lazy Scheduling: A Runtime Adaptive Scheduler for Declarative Parallelism, TOPLAS, vol.36, issue.10, 2014. ,
A bridging model for parallel computation, CACM, vol.33, pp.103-111, 1990. ,
Parallel Execution of Lisp Programs, 1989. ,
A Practical Solution to the Cactus Stack Problem, Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '16), pp.61-70, 2016. ,