inria-00637536, version 1

Parallel and memory-efficient reads indexing for genome assembly

Rayan Chikhi (corresponding author) 1, Guillaume Chapuis a,1, Dominique Lavenier 1

Parallel Bio-Computing 2011 (2011)

Abstract: As genomes, transcriptomes and meta-genomes are being sequenced at a faster pace than ever, there is a pressing need for efficient genome assembly methods. Two practical issues in assembly are heavy memory usage and long execution time during the read indexing phase. In this article, a parallel and memory-efficient method is proposed for reads indexing prior to assembly. Specifically, a hash-based structure that stores a reduced amount of read information is designed. Erroneous entries are filtered on the fly during index construction. A prototype implementation has been designed and applied to actual Illumina short reads. Benchmark evaluation shows that this indexing method requires significantly less memory than those from popular assemblers.

  • a: École normale supérieure de Cachan - ENS Cachan
  • 1: SYMBIOSE (INRIA - IRISA)
  • CNRS: UMR6074 – INRIA – Institut National des Sciences Appliquées (INSA) - Rennes – Université de Rennes 1
  • Domain: Computer Science/Bioinformatics; Life Sciences/Quantitative Methods
 
  • oai:hal.inria.fr:inria-00637536
  • Submitted on: Wednesday, 2 November 2011 11:56:44
  • Updated on: Monday, 7 November 2011 09:12:37