R. B. Ross and R. Thakur, PVFS: A parallel file system for Linux clusters, Proceedings of the 4th Annual Linux Showcase and Conference, pp.391-430, 2000.

Sun Microsystems, LUSTRE file system: high-performance storage architecture and scalable cluster file system, 2007.

K. Bergman, S. Borkar, D. Campbell, W. Carlson, W. Dally et al., Exascale computing study: Technology challenges in achieving exascale systems, Defense Advanced Research Projects Agency Information Processing Techniques Office (DARPA IPTO), vol.15, 2008.

F. Z. Boito, A checkpoint of research on parallel I/O for high-performance computing, ACM Computing Surveys (CSUR), vol.51, issue.2, p.23, 2018.

S. Li, A flattened metadata service for distributed file systems, IEEE Transactions on Parallel and Distributed Systems, vol.29, issue.12, pp.2641-2657, 2018.

Y. Tsujita, Improving collective MPI-IO using topology-aware stepwise data aggregation with I/O throttling, Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, pp.12-23, 2018.

M. Caron, Deep clustering for unsupervised learning of visual features, Proceedings of the European Conference on Computer Vision, pp.132-149, 2018.

A. Radford, L. Metz, and S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, 2015.

M. Macdonald, Automated database workload characterization, mapping, and tuning through machine learning, 2018.

C. , D. Natale, and E. Martinelli, Data analysis, Breath Analysis, pp.81-94, 2019.
URL : https://hal.archives-ouvertes.fr/lirmm-01374569

R. Ross, R. Latham, W. Gropp, R. Thakur, and B. Toonen, Implementing MPI-IO atomic mode without file system support, CCGrid 2005. IEEE International Symposium on Cluster Computing and the Grid, vol.2, pp.1135-1142, 2005.

A. Nisar, W. Liao, and A. Choudhary, Scaling parallel I/O performance through I/O delegate and caching system, Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, p.9, 2008.

Y. Alforov, T. Ludwig, A. Novikova, M. Kuhn, and J. Kunkel, Towards green scientific data compression through high-level I/O interfaces, 2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp.209-216, 2018.

A. R. Carneiro, J. L. Bez, F. Z. Boito, B. A. Fagundes, C. Osthoff et al., Collective I/O performance on the Santos Dumont supercomputer, 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing, pp.45-52, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01711359

P. J. Pavan, Energy efficiency and I/O performance of low-power architectures, Concurrency and Computation: Practice and Experience, p.4948
URL : https://hal.archives-ouvertes.fr/hal-01784497

Q. Zou, Y. Zhu, and D. Feng, A study of self-similarity in parallel I/O workloads, Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, pp.1-6, 2010.

F. Wang, File system workload analysis for large scale scientific computing applications, Lawrence Livermore National Lab.(LLNL), 2004.

Y. Kim, Workload characterization of a leadership class storage cluster, 2010 5th Petascale Data Storage Workshop (PDSW '10), pp.1-5, 2010.

Y. Kim and R. Gunasekaran, Understanding I/O workload characteristics of a peta-scale storage system, The Journal of Supercomputing, vol.71, issue.3, pp.761-780, 2015.

M. Dorier, CALCioM: Mitigating I/O interference in HPC systems through cross-application coordination, Parallel and Distributed Processing Symposium, pp.155-164, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00916091

H. Luu, M. Winslett, W. Gropp, R. Ross, P. Carns et al., A multiplatform study of I/O behavior on petascale supercomputers, Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, pp.33-44, 2015.

Y. Liu, R. Gunasekaran, X. Ma, and S. S. Vazhkudai, Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems, SC'16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp.819-829, 2016.

B. Xie, J. Chase, D. Dillow, O. Drokin, S. Klasky et al., Characterizing output bottlenecks in a supercomputer, High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for, pp.1-11, 2012.

A. K. Mishra, Towards characterizing cloud backend workloads: insights from Google compute clusters, ACM SIGMETRICS Performance Evaluation Review, vol.37, issue.4, pp.34-41, 2010.

S. Di, D. Kondo, and F. Cappello, Characterizing and modeling cloud applications/jobs on a Google data center, The Journal of Supercomputing, vol.69, issue.1, pp.139-160, 2014.

M. C. Calzarossa, L. Massari, and D. Tessera, Workload characterization: A survey revisited, ACM Computing Surveys (CSUR), vol.48, issue.3, p.48, 2016.

P. Carns, R. Latham, R. Ross, K. Iskra, S. Lang et al., 24/7 characterization of petascale I/O workloads, 2009 IEEE International Conference on Cluster Computing and Workshops, pp.1-10, 2009.

M. Lawrence, Software for computing and annotating genomic ranges, PLOS Computational Biology, vol.9, issue.8, pp.1-10, 2013.

J. A. Hartigan and M. A. Wong, Algorithm AS 136: A k-means clustering algorithm, Journal of the Royal Statistical Society. Series C (Applied Statistics), vol.28, issue.1, pp.100-108, 1979.

D. S. Wilks, Cluster analysis, International Geophysics, vol.100, pp.603-616, 2011.

R. C. de Amorim and C. Hennig, Recovering the number of clusters in data sets with noise features using feature rescaling factors, Information Sciences, vol.324, pp.126-145, 2015.

P. J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, Journal of Computational and Applied Mathematics, vol.20, pp.53-65, 1987.

F. Pedregosa, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905