
MindTheGap: integrated detection and assembly of short and long insertions

Abstract :

Motivation: Insertions play an important role in genome evolution. However, such variants are difficult to detect from short read sequencing data, especially when they exceed the paired-end insert size. Many approaches have been proposed to call short insertion variants based on paired-end mapping. However, there remains a lack of practical methods to detect and assemble long variants.

Results: We propose here an original method, called MindTheGap, for the integrated detection and assembly of insertion variants from re-sequencing data. Importantly, it is designed to call insertions of any size, whether they are novel or duplicated, homozygous or heterozygous in the donor genome. MindTheGap uses an efficient k-mer based method to detect insertion sites in a reference genome, and subsequently assemble them from the donor reads. MindTheGap showed high recall and precision on simulated datasets of various genome complexities. When applied to real C. elegans and human NA12878 datasets, MindTheGap detected and correctly assembled insertions longer than 1 kb, using at most 14 GB of memory.

Availability:
Document type :
Journal articles

Cited literature [22 references]
Contributor : Claire Lemaitre
Submitted on : Thursday, November 6, 2014 - 9:54:10 PM
Last modification on : Thursday, January 20, 2022 - 4:13:13 PM
Long-term archiving on : Saturday, February 7, 2015 - 11:30:30 AM


Files produced by the author(s)


Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License



Guillaume Rizk, Anaïs Gouin, Rayan Chikhi, Claire Lemaitre. MindTheGap: integrated detection and assembly of short and long insertions. Bioinformatics, Oxford University Press (OUP), 2014, 30 (24), pp.3451 - 3457. ⟨10.1093/bioinformatics/btu545⟩. ⟨hal-01081089⟩


