
SimkaMin: fast and resource frugal de novo comparative metagenomics

Abstract: Motivation: De novo comparative metagenomics is one of the most straightforward ways to analyze large sets of metagenomic data. Recent methods use the fraction of shared k-mers to estimate genomic similarity between read sets. However, those methods, while extremely efficient, still have computational requirements that limit their practical use outside of large computing facilities. Results: We present SimkaMin, a quick comparative metagenomics tool with low disk and memory footprints, thanks to an efficient data subsampling scheme used to estimate Bray-Curtis and Jaccard dissimilarities. One billion metagenomic reads can be analyzed in <3 min, with tiny memory (1.09 GB) and disk (approximately 0.3 GB) requirements and without altering the quality of the downstream comparative analyses, making SimkaMin a tool perfectly tailored for very large-scale metagenomic projects.
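The idea of estimating Jaccard and Bray-Curtis dissimilarities from a subsample of k-mers can be illustrated with a minimal Python sketch. This is not SimkaMin's implementation: the hash function, the threshold-based subsampling rule, and all function names below are assumptions chosen for illustration. The one property the sketch relies on is that the same hash and the same sampling fraction are applied to every read set, so that the same k-mers are retained in each of them.

```python
import hashlib
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer of length k across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer (illustrative choice of hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def subsample(counts, fraction):
    """Keep only k-mers whose hash lies in the lowest `fraction` of hash space.

    Because the rule depends only on the k-mer itself, two read sets
    subsampled with the same fraction retain the same k-mers.
    """
    threshold = int(fraction * 2**64)
    return Counter({km: c for km, c in counts.items() if kmer_hash(km) < threshold})


def jaccard_dissimilarity(a, b):
    """1 - |A ∩ B| / |A ∪ B| on distinct k-mers (presence/absence)."""
    union = len(a.keys() | b.keys())
    if union == 0:
        return 0.0
    return 1.0 - len(a.keys() & b.keys()) / union


def bray_curtis_dissimilarity(a, b):
    """1 - 2 * sum(min(a_w, b_w)) / (sum(a_w) + sum(b_w)) on k-mer abundances."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = sum(min(a[km], b[km]) for km in a.keys() & b.keys())
    return 1.0 - 2.0 * shared / total
```

With `fraction=1.0` the functions compute the exact dissimilarities; a smaller fraction trades accuracy for lower memory and disk usage, which is the kind of trade-off the abstract describes.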
Document type: Journal articles
Contributor: Claire Lemaitre
Submitted on: Tuesday, October 8, 2019 - 12:17:09 PM
Last modification on: Tuesday, January 4, 2022 - 6:40:59 AM



Gaëtan Benoit, Mahendra Mariadassou, Stéphane Robin, Sophie Schbath, Pierre Peterlongo, et al. SimkaMin: fast and resource frugal de novo comparative metagenomics. Bioinformatics, Oxford University Press (OUP), 2020, 36 (4), pp. 1-2. ⟨10.1093/bioinformatics/btz685⟩. ⟨hal-02308101⟩


