J. L. Abellán, J. Fernández, and M. E. Acacio, GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs, Proceedings of the 2011 IEEE International Parallel and Distributed Processing Symposium (IPDPS '11), pp.893-905, 2011.

G. M. Amdahl, Validity of the single processor approach to achieving large scale computing capabilities, Proceedings of the April 18-20, 1967, Spring Joint Computer Conference, AFIPS '67 (Spring), pp.483-485, 1967.
DOI : 10.1145/1465482.1465560

T. E. Anderson, The performance of spin lock alternatives for shared-memory multiprocessors, IEEE Transactions on Parallel and Distributed Systems, vol.1, issue.1, pp.6-16, 1990.
DOI : 10.1109/71.80120

M. Auslander, D. Edelsohn, O. Krieger, B. Rosenburg, and R. Wisniewski, Enhancement to the MCS lock for increased functionality and improved programmability. U.S. patent application 10, p.745, 2003.

A. Baumann, P. Barham, P. Dagand, T. Harris, R. Isaacs et al., The multikernel, Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, SOSP '09, pp.29-44, 2009.
DOI : 10.1145/1629575.1629579

S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, F. Kaashoek et al., Corey: an Operating System for Many Cores, Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI '08). USENIX Association, pp.43-57, 2008.

S. Boyd-Wickizer, A. T. Clements, Y. Mao, A. Pesterev, M. F. Kaashoek et al., An Analysis of Linux Scalability to Many Cores, Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI '10). USENIX Association, 2010.

B. B. Brandenburg, Improved analysis and evaluation of real-time semaphore protocols for P-FP scheduling, 2013 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS), pp.141-152, 2013.
DOI : 10.1109/RTAS.2013.6531087

A. Brodsky, F. Ellen, and P. Woelfel, Fully-adaptive Algorithms for Long-lived Renaming, Proceedings of the 20th International Conference on Distributed Computing (DISC '06), pp.413-427, 2006.

M. Chabbi, M. Fagan, and J. Mellor-Crummey, High Performance Locks for Multilevel NUMA Systems, Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.215-226, 2015.

T. S. Craig, Building FIFO and priority-queueing spin locks from atomic swap, 1993.

Danga Interactive, Memcached: distributed memory object caching system, 2003.

T. David, R. Guerraoui, and V. Trigonakis, Everything you always wanted to know about synchronization but were afraid to ask, Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP '13, pp.33-48, 2013.
DOI : 10.1145/2517349.2522714

J. Dean and S. Ghemawat, MapReduce, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008.
DOI : 10.1145/1327452.1327492

D. Dice, V. J. Marathe, and N. Shavit, Flat-combining NUMA locks, Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures, SPAA '11, 2011.
DOI : 10.1145/1989493.1989502

D. Dice, V. J. Marathe, and N. Shavit, Lock Cohorting: a General Technique for Designing NUMA Locks, Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '12), pp.247-256, 2012.

D. Dice, V. J. Marathe, and N. Shavit, Lock Cohorting: A General Technique for Designing NUMA Locks, ACM Trans. Parallel Comput., vol.1, issue.42, 2015.

E. W. Dijkstra, Cooperating sequential processes, 1965.

J. Eastep, D. Wingate, M. D. Santambrogio, and A. Agarwal, Smartlocks, Proceeding of the 7th international conference on Autonomic computing, ICAC '10, pp.215-224, 2010.
DOI : 10.1145/1809049.1809079

P. Fatourou and N. D. Kallimanis, Sim: a Highly-Efficient Wait-Free Universal Construction, 2011.

P. Fatourou and N. D. Kallimanis, Revisiting the Combining Synchronization Technique, Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '12), pp.257-266, 2012.

B. Fitzpatrick, Distributed Caching with Memcached, Linux Journal, vol.124, p.5, 2004.

M. Fowler, Refactoring: Improving the Design of Existing Code, 1999.
DOI : 10.1007/3-540-45672-4_31

A. Hassan, R. Palmieri, and B. Ravindran, Remote Invalidation: Optimizing the Critical Path of Memory Transactions, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014.
DOI : 10.1109/IPDPS.2014.30

B. He, W. N. Scherer III, and M. L. Scott, Preemption Adaptivity in Time-Published Queue-Based Spin Locks, Proceedings of the 11th International Conference on High Performance Computing (HiPC '05), pp.7-18, 2005.

B. He, W. N. Scherer III, and M. L. Scott, Time-Published Queue-Based Spin Locks, 2005.

D. Hendler, I. Incze, N. Shavit, and M. Tzafrir, Flat combining and the synchronization-parallelism tradeoff, Proceedings of the 22nd ACM symposium on Parallelism in algorithms and architectures, SPAA '10, pp.355-364, 2010.
DOI : 10.1145/1810479.1810540

M. Herlihy and N. Shavit, The art of multiprocessor programming, Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing, PODC '06, 2008.
DOI : 10.1145/1146381.1146382

C. A. R. Hoare, Monitors: an operating system structuring concept, Communications of the ACM, vol.17, issue.10, pp.549-557, 1974.
DOI : 10.1145/355620.361161

D. Koufaty, D. Reddy, and S. Hahn, Bias scheduling in heterogeneous multi-core architectures, Proceedings of the 5th European conference on Computer systems, EuroSys '10, pp.125-138, 2010.
DOI : 10.1145/1755913.1755928

S. T. Leutenegger and D. Dias, A Modeling Study of the TPC-C Benchmark, Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (SIGMOD '93), pp.22-31, 1993.

J. Lozi, F. David, G. Thomas, J. Lawall, and G. Muller, Remote Core Locking: migrating Critical-section Execution to Improve the Performance of Multithreaded Applications, Proceedings of the 2012 USENIX Annual Technical Conference (USENIX ATC '12). USENIX Association, pp.65-76, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00991709

V. Luchangco, D. Nussbaum, and N. Shavit, A Hierarchical CLH Queue Lock, Proceedings of the 12th International Conference on Parallel Processing (Euro-Par '06), pp.801-810, 2006.
DOI : 10.1007/11823285_84

P. Magnusson, A. Landin, and E. Hagersten, Queue locks on cache coherent multiprocessors, Proceedings of 8th International Parallel Processing Symposium, pp.165-171, 1994.
DOI : 10.1109/IPPS.1994.288305

J. M. Mellor-Crummey and M. L. Scott, Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors, ACM Transactions on Computer Systems, vol.9, issue.1, pp.21-65, 1991.

J. M. Mellor-Crummey and M. L. Scott, Synchronization Without Contention, Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS IV), pp.269-278, 1991.

Oracle Corporation, Berkeley DB, 2004.

J. K. Ousterhout, Scheduling techniques for concurrent systems, Proceedings of the 3rd International Conference on Distributed Computing Systems (ICDCS '82), pp.22-30, 1982.

Y. Oyama, K. Taura, and A. Yonezawa, Executing Parallel Programs With Synchronization Bottlenecks Efficiently, Proceedings of the International Workshop on Parallel and Distributed Computing for Symbolic and Irregular Applications (PDSIA '99), 1999.

Y. Padioleau, J. Lawall, R. R. Hansen, and G. Muller, Documenting and Automating Collateral Evolutions in Linux Device Drivers, Proceedings of the 3rd European Conference on Computer Systems, pp.247-260, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00123142

D. Petrović, T. Ropars, and A. Schiper, Leveraging Hardware Message Passing for Efficient Thread Synchronization, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '14), pp.143-154, 2014.

K. K. Pusukuri, R. Gupta, and L. N. Bhuyan, Lock Contention Aware Thread Migrations, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '14), pp.369-370, 2014.

Z. Radovic and E. Hagersten, Hierarchical backoff locks for nonuniform communication architectures, Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA-9), pp.241-253, 2003.
DOI : 10.1109/HPCA.2003.1183542

C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis, Evaluating MapReduce for Multi-core and Multiprocessor Systems, 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pp.13-24, 2007.
DOI : 10.1109/HPCA.2007.346181

D. P. Reed and R. K. Kanodia, Synchronization with Eventcounts and Sequencers, Commun. ACM, vol.22, issue.2, pp.115-123, 1979.

J. P. Singh, W.-D. Weber, and A. Gupta, SPLASH, ACM SIGARCH Computer Architecture News, vol.20, issue.1, pp.5-44, 1992.
DOI : 10.1145/130823.130824

Stanford University, The Phoenix System for MapReduce Programming, 2011.

J. Talbot, R. M. Yoo, and C. Kozyrakis, Phoenix++, Proceedings of the second international workshop on MapReduce and its applications, MapReduce '11, pp.9-16, 2011.
DOI : 10.1145/1996092.1996095

S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta, The SPLASH-2 Programs: Characterization and Methodological Considerations, Proceedings of the 22nd Annual International Symposium on Computer Architecture (ISCA '95), pp.24-36, 1995.

W. Xiong, S. Park, J. Zhang, Y. Zhou, and Z. Ma, Ad Hoc Synchronization Considered Harmful, Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation (OSDI '10). USENIX Association, pp.1-8, 2010.

R. M. Yoo, A. Romano, and C. Kozyrakis, Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system, 2009 IEEE International Symposium on Workload Characterization (IISWC), pp.198-207, 2009.
DOI : 10.1109/IISWC.2009.5306783

K. Yotov, K. Pingali, and P. Stodghill, Automatic Measurement of Memory Hierarchy Parameters, Proceedings of the 2005 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '05), pp.181-192, 2005.