N. Ali, P. Carns, K. Iskra, D. Kimpe, S. Lang et al., Scalable I/O forwarding framework for high-performance computing systems, 2009 IEEE International Conference on Cluster Computing and Workshops, pp.1-10, 2009.
DOI : 10.1109/CLUSTR.2009.5289188
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.151.2979

R. Bolze, F. Cappello, E. Caron, M. Daydé, F. Desprez et al., Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed, International Journal of High Performance Computing Applications, vol.20, issue.4, p.481, 2006.
DOI : 10.1177/1094342006070078
URL : https://hal.archives-ouvertes.fr/hal-00684943

G. H. Bryan and J. M. Fritsch, A Benchmark Simulation for Moist Nonhydrostatic Numerical Models, Monthly Weather Review, vol.130, issue.12, pp.2917-2928, 2002.
DOI : 10.1175/1520-0493(2002)130<2917:ABSFMN>2.0.CO;2

P. H. Carns, W. B. Ligon III, R. B. Ross, and R. Thakur, PVFS: A parallel file system for Linux clusters, Proceedings of the 4th Annual Linux Showcase & Conference, 2000.

C. Chilan, M. Yang, A. Cheng, and L. Arber, Parallel I/O performance study with HDF5, a scientific data package, 2006.

P. Dickens and J. Logan, Towards a high performance implementation of MPI-I/O on the Lustre file system, On the Move to Meaningful Internet Systems: OTM 2008, 2008.

P. Dickens and R. Thakur, Evaluation of Collective I/O Implementations on Parallel Architectures, Journal of Parallel and Distributed Computing, vol.61, issue.8, pp.1052-1076, 2001.
DOI : 10.1006/jpdc.2000.1733

C. Docan, M. Parashar, and S. Klasky, Enabling high-speed asynchronous data extraction and transfer using DART, Concurrency and Computation: Practice and Experience, pp.1181-1204, 2010.
DOI : 10.1002/cpe.1567
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.389.9676

S. Donovan, G. Huizenga, A. J. Hutton, C. C. Ross, M. K. Petersen et al., Lustre: Building a file system for 1000-node clusters, 2003.

F. Isaila, J. G. Blas, J. Carretero, R. Latham, and R. Ross, Design and Evaluation of Multiple-Level Data Staging for Blue Gene Systems, IEEE Transactions on Parallel and Distributed Systems, vol.22, issue.6, p.99, 2010.
DOI : 10.1109/TPDS.2010.127

J. Li, W. Liao, A. Choudhary, R. Ross, R. Thakur et al., Parallel netCDF, Proceedings of the 2003 ACM/IEEE conference on Supercomputing, SC '03, p.39, 2003.
DOI : 10.1145/1048935.1050189

M. Li, S. Vazhkudai, A. Butt, F. Meng, X. Ma et al., Functional Partitioning to Optimize End-to-End Performance on Many-core Architectures, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2010.
DOI : 10.1109/SC.2010.28

J. Lofstead, F. Zheng, Q. Liu, S. Klasky, R. Oldfield et al., Managing Variability in the IO Performance of Petascale Storage Systems, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2010.
DOI : 10.1109/SC.2010.32

J. F. Lofstead, S. Klasky, K. Schwan, N. Podhorszki, and C. Jin, Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS), Proceedings of the 6th international workshop on Challenges of large applications in distributed environments, CLADE '08, pp.15-24, 2008.
DOI : 10.1145/1383529.1383533

X. Ma, J. Lee, and M. Winslett, High-level buffering for hiding periodic output cost in scientific simulations, IEEE Transactions on Parallel and Distributed Systems, vol.17, issue.3, pp.193-204, 2006.
DOI : 10.1109/TPDS.2006.36

S. Moore, Multicore is bad news for supercomputers, IEEE Spectrum, vol.45, issue.11, p.15, 2008.
DOI : 10.1109/MSPEC.2008.4659375

A. Nisar, W. Liao, and A. Choudhary, Scaling parallel I/O performance through I/O delegate and caching system, SC 2008, International Conference for High Performance Computing, Networking, Storage and Analysis, 2008.
DOI : 10.1109/SC.2008.5214358
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.369.8862

C. M. Patrick, S. Son, and M. Kandemir, Comparative evaluation of overlap strategies with study of I/O overlap in MPI-IO, ACM SIGOPS Operating Systems Review, vol.42, issue.6, pp.43-49, 2008.
DOI : 10.1145/1453775.1453784

J. Prost, R. Treumann, R. Hedges, B. Jia, and A. Koniges, MPI-IO/GPFS, an optimized implementation of MPI-IO on top of GPFS, Proceedings of the 2001 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '01, 2001.
DOI : 10.1145/582034.582051

J. C. Sancho, K. J. Barker, D. J. Kerbyson, and K. Davis, Quantifying the Potential Benefit of Overlapping Communication and Computation in Large-Scale Scientific Applications, ACM/IEEE SC 2006 Conference (SC'06), p.17, 2006.
DOI : 10.1109/SC.2006.51

F. Schmuck and R. Haskin, GPFS: A shared-disk file system for large computing clusters, Proceedings of the First USENIX Conference on File and Storage Technologies, 2002.

H. Shan and J. Shalf, Using IOR to Analyze the I/O performance for HPC Platforms, Cray User Group Conference, 2007.

D. Skinner and W. Kramer, Understanding the causes of performance variability in HPC workloads, Proceedings of the IEEE International Workload Characterization Symposium, pp.137-149, 2005.
DOI : 10.1109/IISWC.2005.1526010

R. Thakur, W. Gropp, and E. Lusk, Data Sieving and Collective I/O in ROMIO, Symposium on the Frontiers of Massively Parallel Processing, p.182, 1999.
DOI : 10.1109/fmpc.1999.750599
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.103.2994

A. Uselton, M. Howison, N. Wright, D. Skinner, N. Keen et al., Parallel I/O performance: From events to ensembles, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-11, 2010.
DOI : 10.1109/IPDPS.2010.5470424

F. Zheng, H. Abbasi, C. Docan, J. Lofstead, Q. Liu et al., PreDatA – preparatory data analytics on peta-scale machines, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010.
DOI : 10.1109/IPDPS.2010.5470454