E. S. Lander, Initial sequencing and analysis of the human genome, Nature, vol.409, pp.860-921

M. D. Adams, The genome sequence of Drosophila melanogaster, Science, vol.287, ISSN 0036-8075, 2000.

Z. D. Stephens, Big Data: Astronomical or Genomical?, PLoS Biology, vol.13, ISSN 1544-9173, 2015.

G. Moore, Cramming More Components Onto Integrated Circuits, Proceedings of the IEEE, vol.86, ISSN 1558-2256, 1998.

R. Leinonen, H. Sugawara, and M. Shumway, The Sequence Read Archive, Nucleic Acids Research, vol.39, 2011.

V. Schildgen and O. Schildgen, How is a molecular polymorphism defined?, Cancer, vol.119, 2013.

L. Janin, O. Schulz-Trieglaff, and A. J. Cox, BEETL-fastq: a searchable compressed archive for DNA reads, Bioinformatics, vol.30, ISSN 1460-2059, 2014.

D. D. Dolle, Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes, Genome Research, vol.27, ISSN 1549-5469, 2017.

P. Ferragina and G. Manzini, Opportunistic data structures with applications, Proceedings 41st Annual Symposium on Foundations of Computer Science, pp.390-398, 2000.

M. Burrows and D. J. Wheeler, A Block-sorting Lossless Data Compression Algorithm, Digital Systems Research Center Research Reports, vol.1, 1995.

N. G. de Bruijn, A combinatorial problem, vol.7

M. G. Grabherr, Full-length transcriptome assembly from RNA-Seq data without a reference genome, Nature Biotechnology, vol.29, ISSN 1546-1696

B. J. Haas, De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis, Nature Protocols, vol.8, ISSN 1750-2799, 2013.

B. Li, Evaluation of de novo transcriptome assemblies from RNA-Seq data, Genome Biology, vol.15, 2014.

R. Wittler, Alignment- and reference-free phylogenomics with colored de-Bruijn graphs, 2019.

R. Uricaru, Reference-free detection of isolated SNPs, Nucleic Acids Research, vol.43, ISSN 0305-1048
URL: https://hal.archives-ouvertes.fr/hal-01083715

T. F. Smith and M. S. Waterman, Identification of common molecular subsequences, Journal of Molecular Biology, vol.147, ISSN 0022-2836, 1981.

B. H. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM, vol.13, 1970.

P. Bradley, H. C. Bakker, E. P. Rocha, G. McVean, and Z. Iqbal, Ultrafast search of all deposited bacterial and viral genomic data, Nature Biotechnology, vol.37, ISSN 1546-1696, 2019.

T. Bingmann, P. Bradley, F. Gauger, and Z. Iqbal, COBS: a Compact Bit-Sliced Signature Index, pp.23-2019, 2019.

P. Pandey, Mantis: A Fast, Small, and Exact Large-Scale Sequence-Search Index, Cell Systems, vol.7, ISSN 2405-4712

P. Pandey, M. A. Bender, R. Johnson, and R. Patro, Squeakr: an exact and approximate k-mer counting system, Bioinformatics, vol.34, ISSN 1367-4803, 2018.

P. Pandey, M. A. Bender, R. Johnson, and R. Patro, A General-Purpose Counting Filter: Making Every Bit Count, in Proceedings of the 2017 ACM International Conference on Management of Data, 2017.

M. D. Muggli, Succinct colored de Bruijn graphs, Bioinformatics, vol.33, ISSN 1460-2059, 2017.

X. Liu, A novel data structure to support ultra-fast taxonomic classification of metagenomic sequences with k-mer signatures, Bioinformatics, vol.34, ISSN 1367-4803, 2018.

Y. Yu, D. Belazzougui, C. Qian, and Q. Zhang, Memory-Efficient and Ultra-Fast Network Lookup and Forwarding Using Othello Hashing, IEEE/ACM Transactions on Networking, vol.26, ISSN 1063-6692, 2018.

A. Limasset, G. Rizk, R. Chikhi, and P. Peterlongo, Fast and scalable minimal perfect hashing for massive key sets, 2017.
URL: https://hal.archives-ouvertes.fr/hal-01566246

B. Solomon and C. Kingsford, Fast search of thousands of short-read sequencing experiments, Nature Biotechnology, vol.34, ISSN 1546-1696, 2016.

R. W. Hamming, Error detecting and error correcting codes, The Bell System Technical Journal, vol.29

B. Solomon and C. Kingsford, Improved Search of Large Transcriptomic Sequencing Databases Using Split Sequence Bloom Trees, Journal of Computational Biology, vol.25, ISSN 1557-8666, 2018.

C. Sun, R. S. Harris, R. Chikhi, and P. Medvedev, AllSome Sequence Bloom Trees, 2017.
URL: https://hal.archives-ouvertes.fr/hal-01575350

R. S. Harris and P. Medvedev, Improved Representation of Sequence Bloom Trees, 2019.

V. Mäkinen and G. Navarro, Rank and select revisited and extended, Theoretical Computer Science, vol.387, 2007.

R. Raman, V. Raman, and S. R. Satti, Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets, ACM Transactions on Algorithms, vol.3, ISSN 1549-6325, 2007.

I. Turner, K. V. Garimella, Z. Iqbal, and G. McVean, Integrating long-range connectivity information into de Bruijn graphs, Bioinformatics, vol.34, issue.1, ISSN 1367-4803, 2018.

Explore to understand, share to bring about change, 2019.

Tara Oceans Coordinators, A global ocean atlas of eukaryotic genes, Nature Communications, vol.9, ISSN 2041-1723

E. Villar, The Ocean Gene Atlas: exploring the biogeography of plankton genes online, Nucleic Acids Research, vol.46, 2018.
URL: https://hal.archives-ouvertes.fr/hal-01803597

T. O. Delmont, Nitrogen-fixing populations of Planctomycetes and Proteobacteria are abundant in surface ocean metagenomes, Nature Microbiology, vol.3, ISSN 2058-5276, 2018.

G. Marçais and C. Kingsford, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers, Bioinformatics, vol.27, ISSN 1367-4803, 2011.

G. Benoit, Multiple comparative metagenomics using multiset k-mer counting, PeerJ Computer Science, vol.2, 2016.
URL: https://hal.archives-ouvertes.fr/hal-01300485

E. Drezen, Genome Assembly & Analysis Tool Box, Bioinformatics, vol.30, ISSN 1367-4803, 2014.

G. Rizk, D. Lavenier, and R. Chikhi, DSK: k-mer counting with very low memory usage, Bioinformatics, vol.29, ISSN 1367-4803, 2013.
URL: https://hal.archives-ouvertes.fr/hal-00778473

A. M. Eren, Anvi'o: an advanced analysis and visualization platform for 'omics data, PeerJ, vol.3, ISSN 2167-8359, e1319.