Parallel and memory-efficient reads indexing for genome assembly

Rayan Chikhi 1, *, Guillaume Chapuis 1, Dominique Lavenier 1
* Corresponding author
1 SYMBIOSE - Biological systems and models, bioinformatics and sequences
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: As genomes, transcriptomes and meta-genomes are being sequenced at a faster pace than ever, there is a pressing need for efficient genome assembly methods. Two practical issues in assembly are heavy memory usage and long execution time during the read indexing phase. This article proposes a parallel and memory-efficient method for indexing reads prior to assembly. Specifically, a hash-based structure is designed that stores a reduced amount of read information, and erroneous entries are filtered on the fly during index construction. A prototype has been implemented and applied to actual Illumina short reads. Benchmark evaluation shows that this indexing method requires significantly less memory than the indexing methods of popular assemblers.
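The abstract does not describe the structure in detail, so the following is only a minimal sketch of the general idea, under stated assumptions: reads are broken into k-mers, only a count per k-mer is kept (rather than the reads themselves), and a k-mer enters the index only once it has been seen `min_count` times. The names `build_index`, `canonical`, `pending` and `min_count`, and the choice of a count threshold as the error filter, are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: NOT the authors' data structure.
# Assumption: "erroneous entries" are approximated as k-mers seen
# fewer than `min_count` times across all reads.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def build_index(reads, k, min_count=2):
    """Build a hash-based k-mer count index, filtering during construction."""
    pending = {}  # k-mers seen fewer than min_count times so far
    index = {}    # k-mers promoted after reaching min_count
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip k-mers containing ambiguous bases such as N
            key = canonical(kmer)
            if key in index:
                index[key] += 1
                continue
            count = pending.get(key, 0) + 1
            if count >= min_count:
                pending.pop(key, None)  # promote to the trusted index
                index[key] = count
            else:
                pending[key] = count
    return index


if __name__ == "__main__":
    reads = ["ACGTAC", "ACGTAC", "ACGTTT"]
    print(build_index(reads, k=4))
```

In this sketch the low-count k-mers still occupy memory in `pending` until the end of construction; a memory-efficient implementation would need a more compact scheme for that stage, which is outside what the abstract specifies.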
Document type:
Conference paper
Parallel Bio-Computing 2011, Sep 2011, Toruń, Poland. 2011

Cited literature [18 references]

https://hal.inria.fr/inria-00637536
Contributor: Rayan Chikhi
Submitted on: Wednesday, 2 November 2011 - 11:56:44
Last modified on: Wednesday, 11 April 2018 - 01:51:12
Document(s) archived on: Friday, 3 February 2012 - 02:25:47

File

CP144.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00637536, version 1

Citation

Rayan Chikhi, Guillaume Chapuis, Dominique Lavenier. Parallel and memory-efficient reads indexing for genome assembly. Parallel Bio-Computing 2011, Sep 2011, Toruń, Poland. 2011. 〈inria-00637536〉
