Parallel and memory-efficient reads indexing for genome assembly

Rayan Chikhi; Guillaume Chapuis; Dominique Lavenier

Conference paper, Year: 2011

Rayan Chikhi

Role: Corresponding author
PersonId : 14839
IdHAL : rayan-chikhi
ORCID : 0000-0003-1099-8735
IdRef : 16546769X

Biological systems and models, bioinformatics and sequences

Guillaume Chapuis

Role: Author
PersonId : 913120

Biological systems and models, bioinformatics and sequences

Dominique Lavenier

Role: Author
PersonId : 1401
IdHAL : dominique-lavenier
ORCID : 0000-0003-2557-680X

Biological systems and models, bioinformatics and sequences

Abstract

As genomes, transcriptomes and meta-genomes are being sequenced at a faster pace than ever, there is a pressing need for efficient genome assembly methods. Two practical issues in assembly are heavy memory usage and long execution time during the read indexing phase. In this article, a parallel and memory-efficient method is proposed for reads indexing prior to assembly. Specifically, a hash-based structure that stores a reduced amount of read information is designed. Erroneous entries are filtered on the fly during index construction. A prototype implementation has been designed and applied to actual Illumina short reads. Benchmark evaluation shows that this indexing method requires significantly less memory than those from popular assemblers.
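
The abstract does not describe the structure in detail, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the index maps each k-mer to a compact list of (read id, offset) pairs, that k-mers are split into hash partitions processed one at a time to bound memory (independent partitions could also be handled in parallel), and that "erroneous" means seen fewer than `min_count` times. The names `index_partition`, `build_index`, `n_parts` and `min_count` are hypothetical.

```python
import zlib
from collections import defaultdict


def index_partition(reads, k, part, n_parts, min_count):
    """Index only the k-mers whose hash falls into partition `part`."""
    table = defaultdict(list)
    for read_id, read in enumerate(reads):
        for offset in range(len(read) - k + 1):
            kmer = read[offset:offset + k]
            # Deterministic hash so that partitions are reproducible.
            if zlib.crc32(kmer.encode()) % n_parts == part:
                # Store a reduced record (read id, offset), not the read itself.
                table[kmer].append((read_id, offset))
    # Assumption: low-abundance k-mers are likely sequencing errors.
    # They are dropped as soon as the partition is complete.
    return {kmer: occ for kmer, occ in table.items() if len(occ) >= min_count}


def build_index(reads, k=4, n_parts=4, min_count=2):
    """Build the full index one partition at a time."""
    index = {}
    for part in range(n_parts):
        index.update(index_partition(reads, k, part, n_parts, min_count))
    return index


reads = ["ACGTAC", "CGTACG", "ACGTTT"]
index = build_index(reads)
print(sorted(index))
```

In this sketch only one partition's unfiltered table is held in memory at a time, at the cost of rescanning the reads once per partition.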

Domains

Bioinformatics [q-bio.QM]; Bioinformatics, Systems Biology [q-bio.QM]

Main file

CP144.pdf (243.29 KB)

Origin: Files produced by the author(s)

Contributor: Rayan Chikhi

https://inria.hal.science/inria-00637536

Submitted on: Wednesday, November 2, 2011, 11:56:44

Last modified on: Friday, March 24, 2023, 14:52:55

Long-term archiving on: Friday, February 3, 2012, 02:25:47

Dates and versions

inria-00637536, version 1 (02-11-2011)

Identifiers

HAL Id: inria-00637536, version 1

Cite

Rayan Chikhi, Guillaume Chapuis, Dominique Lavenier. Parallel and memory-efficient reads indexing for genome assembly. Parallel Bio-Computing 2011, Sep 2011, Toruń, Poland. ⟨inria-00637536⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D7 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM

291 views

390 downloads
