Application-Level Optimizations on NUMA Multicore Architectures: the Apache Case Study

Abstract: Multicore machines with Non-Uniform Memory Access (NUMA) are becoming commonplace, so it is crucial to understand how the resources they provide can be exploited efficiently. Most current research tackles the problem at the Operating System (OS) level, either by improving existing OS primitives or by proposing novel OS designs, with the aim of reducing OS bottlenecks and improving the scalability of applications running on such machines. In this paper, we adopt a complementary perspective: we examine how to optimize the scalability of a parallel application running on top of an unmodified, currently available operating system. The chosen application is the popular Apache-PHP stack. We highlight three performance issues at different levels of the system, caused by (i) excessive remote memory accesses, (ii) inefficient load dispatching among cores, and (iii) contention on kernel data structures. We propose and implement an application-level solution for each issue. Our optimized Apache-PHP software stack achieves 33% higher throughput than the base configuration on a 16-core setup. We conclude the paper with lessons learned on optimizing server applications for multicore computers.
Document type:
[Research Report] RR-LIG-011, 2011
Contributor: Renaud Lachaize
Submitted on: Sunday, February 23, 2014 - 21:51:34
Last modified on: Thursday, January 11, 2018 - 06:22:03


  • HAL Id: hal-00950933, version 1



Fabien Gaud, Renaud Lachaize, Baptiste Lepers, Gilles Muller, Vivien Quéma. Application-Level Optimizations on NUMA Multicore Architectures: the Apache Case Study. [Research Report] RR-LIG-011, 2011. 〈hal-00950933〉


