U. A. Acar, N. Ben-David, and M. Rainey, Contention in structured concurrency: Provably efficient dynamic nonzero indicators for nested parallel computation, 2016.

U. A. Acar, A. Charguéraud, and M. Rainey, Scheduling parallel programs by work stealing with private deques, PPoPP '13, 2013.

U. A. Acar, A. Charguéraud, M. Rainey, and F. Sieczkowski, Dag-calculus: A calculus for parallel computation, Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, pp.18-32, 2016.

A. Agarwal and M. Cherian, Adaptive backoff synchronization techniques, Proceedings of the 16th Annual International Symposium on Computer Architecture, ISCA '89, pp.396-406, 1989.
DOI : 10.1109/isca.1989.714578

URL : http://www.dtic.mil/get-tr-doc/pdf?AD=ADA211940

J. H. Anderson and Y.-J. Kim, An improved lower bound for the time complexity of mutual exclusion, Distrib. Comput., vol.15, issue.4, pp.221-253, 2002.

T. E. Anderson, The performance of spin lock alternatives for shared-memory multiprocessors, IEEE Transactions on Parallel and Distributed Systems, vol.1, issue.1, pp.6-16, 1990.
DOI : 10.1109/71.80120

N. S. Arora, R. D. Blumofe, and C. G. Plaxton, Thread scheduling for multiprogrammed multiprocessors, Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, SPAA '98, pp.119-129, 1998.

R. D. Blumofe and C. E. Leiserson, Scheduling multithreaded computations by work stealing, J. ACM, vol.46, pp.720-748, 1999.

T. Brown, F. Ellen, and E. Ruppert, A general technique for non-blocking trees, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '14, pp.329-342, 2014.
DOI : 10.1145/2692916.2555267

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.693.2102

P. Charles, C. Grothoff, V. Saraswat, C. Donawa, A. Kielstra et al., X10: an object-oriented approach to non-uniform cluster computing, Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, OOPSLA '05, pp.519-538, 2005.

R. Cypher, The communication requirements of mutual exclusion, Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, SPAA '95, pp.147-156, 1995.
DOI : 10.1145/215399.215434

C. Dwork, M. Herlihy, and O. Waarts, Contention in shared memory algorithms, Journal of the ACM, vol.44, issue.6, pp.779-805, 1997.
DOI : 10.1145/268999.269000

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.4313

F. Ellen, P. Fatourou, J. Helga, and E. Ruppert, The amortized complexity of non-blocking binary search trees, Proceedings of the 2014 ACM symposium on Principles of distributed computing, PODC '14, pp.332-340, 2014.
DOI : 10.1145/2611462.2611486

F. Ellen, Y. Lev, V. Luchangco, and M. Moir, SNZI: Scalable NonZero Indicators, Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing, PODC '07, pp.13-22, 2007.
DOI : 10.1145/1281100.1281106

F. Fich and E. Ruppert, Hundreds of impossibility results for distributed computing, Distributed Computing, vol.16, issue.2-3, pp.121-163, 2003.
DOI : 10.1007/s00446-003-0091-y

F. E. Fich, D. Hendler, and N. Shavit, Linear Lower Bounds on Real-World Implementations of Concurrent Objects, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05), pp.165-173, 2005.
DOI : 10.1109/SFCS.2005.47

M. Fluet, M. Rainey, J. Reppy, and A. Shaw, Implicitly threaded parallelism in Manticore, Journal of Functional Programming, vol.20, pp.5-61, 2011.

M. Fomitchev and E. Ruppert, Lock-free linked lists and skip lists, Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing, PODC '04, pp.50-59, 2004.
DOI : 10.1145/1011767.1011776

M. Frigo, C. E. Leiserson, and K. H. Randall, The implementation of the Cilk-5 multithreaded language, PLDI, pp.212-223, 1998.

P. B. Gibbons, Y. Matias, and V. Ramachandran, Efficient low-contention parallel algorithms, Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures, pp.236-247, 1994.

P. B. Gibbons, Y. Matias, and V. Ramachandran, The queue-read queue-write PRAM model: Accounting for contention in parallel algorithms, SIAM Journal on Computing, pp.638-648, 1997.

J. R. Goodman, M. K. Vernon, and P. J. Woest, Efficient synchronization primitives for large-scale cache-coherent multiprocessors, Proceedings of the Third International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS III, pp.64-75, 1989.
DOI : 10.1145/70082.68188

T. Harris and K. Fraser, Language support for lightweight transactions, Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, OOPSLA '03, pp.388-402, 2003.
DOI : 10.1145/2641638.2641654

M. Herlihy, Wait-free synchronization, ACM Transactions on Programming Languages and Systems, vol.13, issue.1, pp.124-149, 1991.
DOI : 10.1145/114005.102808

M. Herlihy, A methodology for implementing highly concurrent data objects, ACM Transactions on Programming Languages and Systems, vol.15, issue.5, pp.745-770, 1993.
DOI : 10.1145/161468.161469

M. Herlihy, V. Luchangco, M. Moir, and W. N. Scherer III, Software transactional memory for dynamic-sized data structures, Proceedings of the twenty-second annual symposium on Principles of distributed computing, PODC '03, pp.92-101, 2003.
DOI : 10.1145/872035.872048

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.222.1147

S. M. Imam and V. Sarkar, Habanero-Java library: a Java 8 framework for multicore programming, 2014 International Conference on Principles and Practices of Programming on the Java Platform Virtual Machines, Languages and Tools, PPPJ '14, pp.75-86, 2014.

Intel, Intel Threading Building Blocks, 2011.

P. Jayanti, K. Tan, and S. Toueg, Time and Space Lower Bounds for Nonblocking Implementations, SIAM Journal on Computing, vol.30, issue.2, pp.438-456, 2000.
DOI : 10.1137/S0097539797317299

R. M. Karp and Y. Zhang, Randomized parallel algorithms for backtrack search and branch-and-bound computation, Journal of the ACM (JACM), vol.40, issue.3, pp.765-789, 1993.

G. Keller, M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones, and B. Lippmeier, Regular, shape-polymorphic, parallel arrays in Haskell, Proceedings of the 15th ACM SIGPLAN international conference on Functional programming, pp.261-272, 2010.
DOI : 10.1145/1932681.1863582

A. Kogan and E. Petrank, A methodology for creating fast wait-free data structures, Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '12, pp.141-150, 2012.
DOI : 10.1145/2370036.2145835

D. Lea, A Java fork/join framework, Proceedings of the ACM 2000 conference on Java Grande, JAVA '00, pp.36-43, 2000.
DOI : 10.1145/337449.337465

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.1918

D. Leijen, W. Schulte, and S. Burckhardt, The design of a task parallel library, Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications, OOPSLA '09, pp.227-242, 2009.

P. Liu, W. Aiello, and S. Bhatt, An atomic model for message-passing, Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures, SPAA '93, pp.154-163, 1993.
DOI : 10.1145/165231.165251

M. M. Michael and M. L. Scott, Nonblocking algorithms and preemption-safe locking on multiprogrammed shared memory multiprocessors, J. Parallel Distrib. Comput., vol.51, issue.1, pp.1-26, 1998.

M. Moir and N. Shavit, Concurrent data structures, Handbook of Data Structures and Applications, pp.47-61, 2007.
DOI : 10.1201/9781420035179.ch47

R. Oshman and N. Shavit, The SkipTrie, Proceedings of the 2013 ACM symposium on Principles of distributed computing, PODC '13, pp.23-32, 2013.
DOI : 10.1145/2484239.2484270

W. N. Scherer and M. L. Scott, Advanced contention management for dynamic software transactional memory, Proceedings of the twenty-fourth annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing, PODC '05, pp.240-248, 2005.
DOI : 10.1145/1073814.1073861

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.509.1786

N. Shavit and D. Touitou, Software transactional memory, Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing, PODC '95, pp.204-213, 1995.

N. Shavit and A. Zemach, Combining funnels, Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing, PODC '98, pp.1355-1387, 2000.
DOI : 10.1145/277697.277707

J. Shun, G. E. Blelloch, J. T. Fineman, and P. B. Gibbons, Reducing contention through priority updates, Proceedings of the Twenty-fifth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '13, pp.152-163, 2013.
DOI : 10.1145/2442516.2442554

S. Timnat and E. Petrank, A practical wait-free simulation for lock-free data structures, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '14, pp.357-368, 2014.

R. K. Treiber, Systems programming: Coping with parallelism, International Business Machines Incorporated, 1986.