Message Passing Interface Forum, MPI-2: Extensions to the Message-Passing Interface, 1997.

R. Thakur, W. Gropp, and E. Lusk, A case for using MPI's derived datatypes to improve I/O performance, Proceedings of SC98: High Performance Networking and Computing, 1998.

H. Luu, M. Winslett, W. Gropp, R. Ross, P. Carns et al., A Multiplatform Study of I/O Behavior on Petascale Supercomputers, Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, HPDC '15, pp.33-44, 2015.
DOI : 10.1109/SC.2008.5222721

S. Byna, R. Sisneros, K. Chadalavada, and Q. Koziol, Tuning parallel I/O on Blue Waters for writing 10 trillion particles, Cray User Group (CUG) meeting, 2015. Available: https://sdm.lbl.gov/ sbyna

M. S. Breitenfeld, K. Chadalavada, R. Sisneros, S. Byna, Q. Koziol et al., Recent progress in tuning performance of large-scale I/O with parallel HDF5, 2014.

H. Bui, H. Finkel, V. Vishwanath, S. Habib, K. Heitmann et al., Scalable Parallel I/O on a Blue Gene/Q Supercomputer Using Compression, Topology-Aware Data Aggregation, and Subfiling, 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pp.107-111, 2014.
DOI : 10.1109/PDP.2014.60

F. Schmuck and R. Haskin, GPFS: A shared-disk file system for large computing clusters, Proceedings of the 1st USENIX Conference on File and Storage Technologies, USENIX Association, 2002.

Lustre filesystem website.

M. Chaarawi, S. Chandok, and E. Gabriel, Performance Evaluation of Collective Write Algorithms in MPI I/O, pp.185-194, 2009.
DOI : 10.1007/978-3-642-01970-8_19

V. Venkatesan, R. Anand, J. Subhlok, and E. Gabriel, Optimized process placement for collective I/O operations, Proceedings of the 20th European MPI Users' Group Meeting, ser. EuroMPI '13, pp.31-36, 2013.

F. Isaila, P. Balaprakash, S. M. Wild, D. Kimpe, R. Latham et al., Collective I/O tuning using analytical and machine learning models, IEEE Cluster 2015, p.9, 2015.
DOI : 10.1109/cluster.2015.29

J. M. del Rosario, R. Bordawekar, and A. Choudhary, Improved parallel I/O via a two-phase run-time access strategy, ACM SIGARCH Computer Architecture News, vol.21, issue.5, pp.31-38, 1993.
DOI : 10.1145/165660.165667

W. Gropp, MPICH2: A new start for MPI implementations, Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, p.7, 2002.

R. Thakur, W. Gropp, and E. Lusk, Data sieving and collective I/O in ROMIO, Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation, ser. FRONTIERS '99, p.182, 1999.

Y. Tsujita, H. Muguruma, K. Yoshinaga, A. Hori, M. Namiki et al., Improving collective I/O performance using pipelined two-phase I/O, Proceedings of the 2012 Symposium on High Performance Computing, ser. HPC '12, pp.1-7, 2012.

Y. Tsujita, K. Yoshinaga, A. Hori, M. Sato, M. Namiki et al., Multithreaded Two-Phase I/O: Improving Collective MPI-IO Performance on a Lustre File System, 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pp.232-235, 2014.
DOI : 10.1109/PDP.2014.46

M. Chaarawi and E. Gabriel, Automatically Selecting the Number of Aggregators for Collective I/O Operations, 2011 IEEE International Conference on Cluster Computing, pp.428-437, 2011.
DOI : 10.1109/CLUSTER.2011.79

R. Filgueira, D. E. Singh, J. C. Pichel, F. Isaila, and J. Carretero, Data locality aware strategy for two-phase collective I/O, High Performance Computing for Computational Science - VECPAR 2008, 8th International Conference, Revised Selected Papers, pp.137-149, 2008.

H. Bui, J. Leigh, E. Jung, V. Vishwanath, and M. E. Papka, Improving Data Movement Performance for Sparse Data Patterns on the Blue Gene/Q Supercomputer, 2014 43rd International Conference on Parallel Processing Workshops, pp.302-311, 2014.
DOI : 10.1109/ICPPW.2014.47

V. Vishwanath, M. Hereld, V. Morozov, and M. E. Papka, Topology-aware data movement and staging for I/O acceleration on Blue Gene/P supercomputing systems, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '11, pp.1-19, 2011.
DOI : 10.1145/2063384.2063409

F. Tessier, P. Malakar, V. Vishwanath, E. Jeannot, and F. Isaila, Topology-aware data aggregation for intensive I/O on large-scale supercomputers, Proceedings of the First Workshop on Optimization of Communication in HPC, ser. COM-HPC '16, pp.73-81, 2016.
DOI : 10.1109/comhpc.2016.013

M. Gilge, IBM System Blue Gene Solution: Blue Gene/Q Application Development, IBM Redbooks, 2014.

P. Schwan, Lustre: Building a file system for 1,000-node clusters, Proceedings of the Linux Symposium, p.9, 2003.