hal-00703130, version 1

Pipelining the Fast Multipole Method over a Runtime System

Emmanuel Agullo 1,2, Bérenger Bramas 1,2, Olivier Coulaud (http://www.labri.fr/~coulaud) 1,2, Eric Darve a,3,4, Matthias Messner 1,2, Toru Takahashi b,5

No. RR-7981 (2012)

Abstract: Fast Multipole Methods (FMM) are fundamental operations in the simulation of many physical problems. Designing a high-performance implementation of such methods usually requires carefully tuning the algorithm for both the targeted physics and the hardware. In this paper, we propose a new approach that achieves high performance across architectures. Our method consists of expressing the FMM algorithm as a task flow and employing a state-of-the-art runtime system, StarPU, to process the tasks on the different processing units. We carefully design the task flow, the mathematical operators, their Central Processing Unit (CPU) and Graphics Processing Unit (GPU) implementations, and the scheduling schemes. We compute the potentials and forces of 200 million particles in 48.7 seconds on a homogeneous 160-core SGI Altix UV 100, and of 38 million particles in 13.34 seconds on a heterogeneous 12-core Intel Nehalem processor enhanced with 3 Nvidia M2090 Fermi GPUs.

  • a –  Stanford University
  • b –  Nagoya University
  • 1 :  HiePACS (INRIA Bordeaux - Sud-Ouest)
  • INRIA – Université de Bordeaux – CNRS : UMR5800 – CERFACS
  • 2 :  Laboratoire Bordelais de Recherche en Informatique (LaBRI)
  • CNRS : UMR5800 – Université Sciences et Technologies - Bordeaux I – École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB) – Université Victor Segalen - Bordeaux II
  • 3 :  Mechanical Engineering Department
  • Stanford University
  • 4 :  Institute for Computational and Mathematical Engineering (iCME)
  • Stanford University
  • 5 :  Department of Mechanical Science and Engineering
  • Nagoya University
  • Collaboration: PlaFRIM; FastLA
  • Domain: Computer science / Parallel, distributed and shared computing
  • Keywords: Fast multipole methods – graphics processing unit – heterogeneous architectures – runtime system – pipeline – FMM
  • Internal reference: RR-7981
 
  • oai:hal.inria.fr:hal-00703130
  • Contributor: 
  • Submitted on: Friday, 1 June 2012, 09:16:22
  • Last modified on: Monday, 15 April 2013, 09:16:37