Conference paper, Year: 2021

GraphUnzip: unzipping assembly graphs with long reads and Hi-C

Abstract

Long reads and Hi-C have revolutionized the field of genome assembly as they have made highly contiguous assemblies accessible even for challenging genomes. As haploid chromosome-level assemblies are now commonly achieved for all types of organisms, phasing assemblies has become the new frontier for genome reconstruction. Several tools have already been released using long reads and/or Hi-C to phase assemblies, but they all start from a set of linear sequences and are ill-suited for non-model organisms with high levels of heterozygosity. We present GraphUnzip, a fast, memory-efficient and flexible tool to unzip assembly graphs into their constituent haplotypes using long reads and/or Hi-C data. As GraphUnzip only connects sequences that already had a potential link in the assembly graph, it yields high-quality gap-less supercontigs. To demonstrate the efficiency of GraphUnzip, we tested it on the human HG00733 and the potato Solanum tuberosum. In both cases, GraphUnzip yielded phased assemblies with improved contiguity.
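
To illustrate the constraint stated in the abstract, that only sequences already linked in the assembly graph are connected, the following Python sketch filters candidate joins against the links of a graph. It is not GraphUnzip's implementation: the GFA input format, the function names parse_gfa_links and filter_supported_joins, and the toy contigs are assumptions made for this example, and strand orientation is ignored for brevity.

```python
from collections import defaultdict


def parse_gfa_links(gfa_text):
    """Collect segment adjacencies from GFA 'L' lines.

    Assumption: the graph is given as GFA1 text, where a link line reads
    L <from> <from_orient> <to> <to_orient> <overlap> (tab-separated).
    Orientation is ignored here to keep the sketch short.
    """
    adjacency = defaultdict(set)
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "L" and len(fields) >= 5:
            left, right = fields[1], fields[3]
            adjacency[left].add(right)
            adjacency[right].add(left)
    return adjacency


def filter_supported_joins(adjacency, candidate_joins):
    """Keep only candidate joins that already exist as links in the graph.

    candidate_joins is a hypothetical list of (contig_a, contig_b) pairs,
    for instance pairs suggested by long reads or Hi-C contacts.
    """
    return [
        (a, b)
        for a, b in candidate_joins
        if b in adjacency.get(a, set())
    ]


# Toy graph: three contigs, with a single link between c1 and c2.
toy_gfa = (
    "S\tc1\tACGT\n"
    "S\tc2\tGGTT\n"
    "S\tc3\tTTAA\n"
    "L\tc1\t+\tc2\t+\t0M\n"
)
toy_adjacency = parse_gfa_links(toy_gfa)
toy_candidates = [("c1", "c2"), ("c1", "c3")]
print(filter_supported_joins(toy_adjacency, toy_candidates))
# prints [('c1', 'c2')]
```

In this toy case the c1 to c3 join is discarded because no link between them exists in the graph, which is why joins accepted this way never need to bridge a gap.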
Main file
proceedings_GraphUnzip_jobim2021.pdf (290.3 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03441016, version 1 (22-11-2021)

Identifiers

  • HAL Id: hal-03441016, version 1

Cite

Roland Faure, Nadège Guiglielmoni, Jean-François Flot. GraphUnzip: unzipping assembly graphs with long reads and Hi-C. JOBIM 2021 - Journées Ouvertes en Biologie, Informatique et Mathématiques, Jul 2021, Paris, France. pp.1-7. ⟨hal-03441016⟩