Space-efficient and exact de Bruijn graph representation based on a Bloom filter - Inria (Institut national de recherche en sciences et technologies du numérique)
Conference paper. Lecture Notes in Computer Science. Year: 2012

Abstract

The de Bruijn graph data structure is widely used in next-generation sequencing (NGS). Many programs, e.g. de novo assemblers, rely on an in-memory representation of this graph. However, current techniques for representing the de Bruijn graph of a human genome require a large amount of memory (> 30 GB). We propose a new encoding of the de Bruijn graph, which occupies an order of magnitude less space than current representations. The encoding is based on a Bloom filter, with an additional structure to remove critical false positives. An assembly software implementing this structure, Minia, performed a complete de novo assembly of human genome short reads using 5.7 GB of memory in 23 hours.
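The encoding described in the abstract can be illustrated with a small Python sketch. It is not Minia's implementation: it assumes that a "critical false positive" is a k-mer that the Bloom filter wrongly reports as present and that is adjacent to a true k-mer. The class names, the SHA-256 based hashing, the filter sizes, and the forward-strand-only neighbor rule (no reverse complements) are all illustrative assumptions. The sketch also keeps the full k-mer set in memory while building the extra structure, which a memory-constrained tool would presumably avoid.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter over strings (sizes are illustrative, not tuned)."""

    def __init__(self, n_bits=64, n_hashes=2):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive n_hashes positions by salting SHA-256 with a seed (assumption).
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # May return True for items never added (false positives),
        # but never returns False for an item that was added.
        return all(self.bits[pos] for pos in self._positions(item))


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_neighbors(kmer):
    """The up to 8 k-mers overlapping kmer by k-1 characters (forward strand only)."""
    for base in "ACGT":
        yield kmer[1:] + base   # possible successor
        yield base + kmer[:-1]  # possible predecessor


class ExactDeBruijn:
    """Bloom filter plus an explicit set of critical false positives."""

    def __init__(self, true_kmers, n_bits=64, n_hashes=2):
        self.bloom = BloomFilter(n_bits, n_hashes)
        for km in true_kmers:
            self.bloom.add(km)
        # False positives that are reachable from a true k-mer in one step.
        self.critical_fp = {
            nb
            for km in true_kmers
            for nb in candidate_neighbors(km)
            if nb in self.bloom and nb not in true_kmers
        }

    def neighbors(self, kmer):
        """Exact neighbors of kmer, provided kmer itself is a true node."""
        return {
            nb
            for nb in candidate_neighbors(kmer)
            if nb in self.bloom and nb not in self.critical_fp
        }


# Usage: a deliberately tiny filter, so that false positives are likely.
true_kmers = kmers("ACGTTGCATGGAC", 4)
graph = ExactDeBruijn(true_kmers, n_bits=16, n_hashes=1)
for km in true_kmers:
    exact = {nb for nb in candidate_neighbors(km) if nb in true_kmers}
    assert graph.neighbors(km) == exact
```

Under these assumptions, a traversal that starts from a true k-mer and only follows `neighbors` never leaves the true graph, because every false positive it could reach in one step is listed in `critical_fp`. False positives that are not adjacent to any true k-mer are never queried, so they do not need to be stored.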
Main file
minia.pdf (130.04 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00753930, version 1 (19-11-2012)

Identifiers

HAL Id: hal-00753930
DOI: 10.1007/978-3-642-33122-0_19

Cite

Rayan Chikhi, Guillaume Rizk. Space-efficient and exact de Bruijn graph representation based on a Bloom filter. WABI 2012, Sep 2012, Ljubljana, Slovenia. pp. 236-248, ⟨10.1007/978-3-642-33122-0_19⟩. ⟨hal-00753930⟩