S. M. Williams and J. H. Moore, Big Data analysis on autopilot?, BioData Mining, vol.6, issue.1, p.22, 2013.
DOI : 10.1186/1756-0381-6-22

R. Overbeek, M. Fonstein, M. D'Souza, G. D. Pusch, and N. Maltsev, The use of gene clusters to infer functional coupling, Proceedings of the National Academy of Sciences, vol.96, issue.6, pp.2896-2901, 1999.
DOI : 10.1073/pnas.96.6.2896

A. J. Enright, S. Van Dongen, and C. A. Ouzounis, An efficient algorithm for large-scale detection of protein families, Nucleic Acids Research, vol.30, issue.7, pp.1575-1584, 2002.
DOI : 10.1093/nar/30.7.1575

A. Sarkar, H. Soueidan, and M. Nikolski, Identification of conserved gene clusters in multiple genomes based on synteny and homology, BMC Bioinformatics, vol.12, issue.Suppl 9, p.S18, 2011.
DOI : 10.1186/1471-2105-12-S9-S18

URL : https://hal.archives-ouvertes.fr/hal-00737346

V. Miele, S. Penel, and L. Duret, Ultra-fast sequence clustering from similarity networks with SiLiX, BMC Bioinformatics, vol.12, issue.1, p.116, 2011.
DOI : 10.1186/1471-2105-12-116

URL : https://hal.archives-ouvertes.fr/hal-00698365

R. Röttger, P. Kalaghatgi, P. Sun, S. D. Soares, V. Azevedo et al., Density parameter estimation for finding clusters of homologous proteins--tracing actinobacterial pathogenicity lifestyles, Bioinformatics, vol.29, issue.2, pp.215-237, 2013.
DOI : 10.1093/bioinformatics/bts653

D. E. Fouts, L. Brinkac, E. Beck, J. Inman, and G. Sutton, PanOCT: automated clustering of orthologs using conserved gene neighborhood for pan-genomic analysis of bacterial strains and closely related species, Nucleic Acids Research, vol.40, issue.22, p.e172, 2012.
DOI : 10.1093/nar/gks757

J. Bonet, J. Planas-Iglesias, J. Garcia-Garcia, M. Marín-López, N. Fernandez-Fuentes et al., ArchDB 2014: structural classification of loops in proteins, Nucleic Acids Research, vol.42, issue.D1, pp.315-324, 2014.
DOI : 10.1093/nar/gkt1189

S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, Basic local alignment search tool, Journal of Molecular Biology, vol.215, issue.3, pp.403-410, 1990.
DOI : 10.1016/S0022-2836(05)80360-2

S. Freilich, L. Goldovsky, A. Gottlieb, E. Blanc, S. Tsoka et al., Stratification of co-evolving genomic groups using ranked phylogenetic profiles, BMC Bioinformatics, vol.10, issue.1, p.355, 2009.
DOI : 10.1186/1471-2105-10-355

F. E. Psomopoulos, P. A. Mitkas, and C. A. Ouzounis, Detection of Genomic Idiosyncrasies Using Fuzzy Phylogenetic Profiles, PLoS ONE, vol.8, issue.1, p.e52854, 2013.
DOI : 10.1371/journal.pone.0052854

C. Fraser, J. Gocayne, O. White, M. Adams, and R. Clayton, The Minimal Gene Complement of Mycoplasma genitalium, Science, vol.270, issue.5235, pp.397-403, 1995.
DOI : 10.1126/science.270.5235.397

J. Glass, E. Lefkowitz, J. Glass, C. Heiner, and E. Chen, The complete sequence of the mucosal pathogen Ureaplasma urealyticum, Nature, vol.407, issue.6805, pp.757-762, 2000.
DOI : 10.1038/35037619

J. Ferretti, W. McShan, D. Ajdic, D. Savic, and G. Savic, Complete genome sequence of an M1 strain of Streptococcus pyogenes, Proceedings of the National Academy of Sciences, vol.98, issue.8, pp.4658-4663, 2001.
DOI : 10.1073/pnas.071559398

S. Shigenobu, H. Watanabe, M. Hattori, Y. Sakaki, and H. Ishikawa, Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS, Nature, vol.407, issue.6800, pp.81-86, 2000.

E. Waters, M. Hohn, I. Ahel, D. Graham, and M. Adams, The genome of Nanoarchaeum equitans: Insights into early archaeal evolution and derived parasitism, Proceedings of the National Academy of Sciences, vol.100, issue.22, pp.12984-12988, 2003.
DOI : 10.1073/pnas.1735403100