Journal article, Bioinformatics, 2020

SimkaMin: fast and resource frugal de novo comparative metagenomics

Abstract

Motivation: De novo comparative metagenomics is one of the most straightforward ways to analyze large sets of metagenomic data. Recent methods use the fraction of shared k-mers to estimate genomic similarity between read sets. However, these methods, while extremely efficient, still have computational requirements that limit their practical use outside large computing facilities.

Results: We present SimkaMin, a fast comparative metagenomics tool with low disk and memory footprints, thanks to an efficient data subsampling scheme used to estimate Bray-Curtis and Jaccard dissimilarities. One billion metagenomic reads can be analyzed in <3 min, with small memory (1.09 GB) and disk (approximately 0.3 GB) requirements and without altering the quality of the downstream comparative analyses, making SimkaMin a tool well suited to very large-scale metagenomic projects.
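To make the idea of estimating dissimilarities from a k-mer subsample concrete, here is a minimal Python sketch. It is not SimkaMin's implementation: the abstract does not detail the subsampling scheme, so the sketch assumes one common approach, keeping the k-mers with the smallest hash values over the union of two read sets. All function names are hypothetical, and reverse-complement (canonical) k-mers are ignored for brevity.

```python
import hashlib
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def stable_hash(kmer):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per run)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def subsample(counts_a, counts_b, size):
    """Assumed scheme: keep the `size` k-mers with the smallest hash values
    among all k-mers seen in either read set."""
    union = set(counts_a) | set(counts_b)
    return sorted(union, key=stable_hash)[:size]


def jaccard_dissimilarity(counts_a, counts_b, kmers):
    """1 - (sampled k-mers present in both sets) / (sampled k-mers)."""
    if not kmers:
        return 0.0
    shared = sum(1 for x in kmers
                 if counts_a.get(x, 0) > 0 and counts_b.get(x, 0) > 0)
    return 1.0 - shared / len(kmers)


def bray_curtis_dissimilarity(counts_a, counts_b, kmers):
    """1 - 2 * sum(min abundance) / sum(total abundance), on sampled k-mers."""
    total = sum(counts_a.get(x, 0) + counts_b.get(x, 0) for x in kmers)
    if total == 0:
        return 0.0
    common = sum(min(counts_a.get(x, 0), counts_b.get(x, 0)) for x in kmers)
    return 1.0 - 2.0 * common / total


if __name__ == "__main__":
    a = kmer_counts(["ACGTAC"], k=3)
    b = kmer_counts(["ACGTTT"], k=3)
    sample = subsample(a, b, size=4)  # subsample of the 6 distinct k-mers
    print(jaccard_dissimilarity(a, b, sample))
    print(bray_curtis_dissimilarity(a, b, sample))
```

When the sample size is at least the number of distinct k-mers, both functions return the exact dissimilarities; smaller samples trade accuracy for lower memory and disk use, which is the trade-off the abstract describes.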
Origin: Files produced by the author(s)

Dates and versions

hal-02308101, version 1 (08-10-2019)

Cite

Gaëtan Benoit, Mahendra Mariadassou, Stéphane Robin, Sophie Schbath, Pierre Peterlongo, et al. SimkaMin: fast and resource frugal de novo comparative metagenomics. Bioinformatics, 2020, 36 (4), pp.1-2. ⟨10.1093/bioinformatics/btz685⟩. ⟨hal-02308101⟩