Distributed message passing with MPI4Py

Federico Tesser 1, 2
2 MEMPHIS - Modeling Enablers for Multi-PHysics and InteractionS
Inria Bordeaux - Sud-Ouest, IMB - Institut de Mathématiques de Bordeaux
Abstract: MPI4Py provides open-source Python bindings to most of the functionality of the MPI-1/2/3 specifications of the Message Passing Interface (MPI), the de facto standard for distributed-memory programming. This tutorial describes the core concepts of MPI and presents the point-to-point and collective communication methods for primitive and composite datatypes, demonstrating their use in code examples written in a pure Python environment. In a language where the Global Interpreter Lock (GIL) limits thread-level parallelism, MPI4Py improves performance simply by exposing the API of the MPI standard, so that users can solve their problems with multiple processes without having to "get their hands dirty working on bare metal". The tutorial covers the following topics, interleaving theory with practical programming examples (written, of course, in Python and sometimes compared with the corresponding C/C++ versions):
  • An introduction to distributed-memory programming and a brief history of MPI.
  • The differences between the SPMD (Single Program, Multiple Data) and MPMD (Multiple Program, Multiple Data) paradigms, both of which are used to write MPI software.
  • MPI core concepts: MPI processes, inter- and intra-communicators, message routing and buffering. These concepts provide the context for understanding everything else MPI offers.
  • MPI and MPI4Py environment management routines. These routines initialize, terminate, query and configure the MPI execution environment, and serve purposes such as querying a process's rank or the MPI library's version (the famous Hello World! section :-)).
  • Point-to-point communications: all the methods MPI offers to transmit a message between a pair of processes. Blocking (buffered, synchronous and ready) and non-blocking (immediate) communications will be presented.
  • Collective communications: operations in which a whole group of processes, rather than a single pair, takes part at the same time. The one-to-all, all-to-one and all-to-all groups will be explored.
  • Differences between communicating generic Python objects and buffer-like objects. MPI4Py can communicate any built-in or user-defined Python object, but this approach imposes significant memory and processor overheads, especially when the objects being communicated have large memory footprints. That is why MPI4Py also supports direct communication of any object that exposes a contiguous memory buffer (specified by its address and extent) containing the relevant data, with negligible overhead. NumPy arrays are one example of such objects.
  • MPI datatypes. Many algorithms exchange data whose layout differs from the predefined MPI datatypes, so the user must specify the type of the data sent between processes.
  • Conclusions, references and useful links.
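The distinction drawn above between generic Python objects and buffer-like objects can be illustrated without MPI at all. The sketch below is not taken from the tutorial and makes no MPI calls. It uses only the Python standard library, with `array.array` standing in for a NumPy array, to contrast the two data paths: serializing an object with `pickle`, and exposing its existing contiguous memory through the buffer protocol. In MPI4Py the lowercase methods (`send`, `recv`, `bcast`) follow the first path and the uppercase methods (`Send`, `Recv`, `Bcast`) the second. The variable names and the array size are arbitrary choices made for illustration.

```python
import pickle
from array import array

# 100,000 doubles stored contiguously (a stdlib stand-in for a NumPy array).
values = array("d", range(100_000))

# Generic-object path: the object is serialized into a new bytes string
# before sending, and rebuilt as a new object on the receiving side.
payload = pickle.dumps(values, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

# Buffer path: a memoryview exposes the existing memory without copying it.
view = memoryview(values)

print("pickled payload bytes:", len(payload))
print("raw buffer bytes:     ", view.nbytes)
print("contiguous:", view.contiguous, "| item format:", view.format)

# Writing through the view changes the original array (no copy was made),
# while the object rebuilt from the pickle is an independent copy.
view[0] = 42.0
print("values[0]:  ", values[0])
print("restored[0]:", restored[0])
```

The pickle path allocates a full serialized copy on the sending side and a second object on the receiving side, whereas the buffer path only hands over a description of memory that already exists. This is the overhead the abstract refers to.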
Document type:
Conference paper
Euroscipy 2016, Aug 2016, Erlangen, Germany

Contributor: Federico Tesser
Submitted on: Wednesday, November 30, 2016 - 13:58:48
Last modified on: Thursday, January 11, 2018 - 06:27:21


  • HAL Id: hal-01405507, version 1



Federico Tesser. Distributed message passing with MPI4Py. Euroscipy 2016, Aug 2016, Erlangen, Germany. 〈hal-01405507〉


