Conference papers

LEVIATHAN: efficient discovery of large structural variants by leveraging long-range information from Linked-Reads data

Abstract : Linked-Reads technologies, popularized by 10x Genomics, combine the high quality and low cost of short-read sequencing with long-range information by adding barcodes that tag reads originating from the same long DNA fragment. Thanks to their high quality and long-range information, such reads are particularly useful for applications such as genome scaffolding and structural variant calling. As a result, multiple structural variant calling methods have been developed within the last few years. However, these methods were mainly tested on human data and do not run well on non-human organisms, for which reference genomes are highly fragmented or sequencing data display high levels of heterozygosity. Moreover, even on human data, most tools still require large amounts of computing resources. We present LEVIATHAN, a new structural variant calling tool that aims to address these issues, and in particular to scale better and apply to a wide variety of organisms. Our method relies on a barcode index, which makes it possible to quickly compare all possible pairs of regions in terms of the number of barcodes they share. Region pairs sharing a sufficient number of barcodes are then considered as potential structural variants, and complementary, classical short-read methods are applied to further refine the breakpoint coordinates. Our experiments on simulated data show that our method compares well to the state of the art, both in terms of recall and precision and in terms of resource consumption. Moreover, LEVIATHAN was successfully applied to a real dataset from a non-model organism, while all other tools either failed to run or required unreasonable amounts of resources. LEVIATHAN is implemented in C++, supported on Linux platforms, and available under the AGPL-3.0 License at https://github.com/morispi/LEVIATHAN.
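
The barcode-index idea described in the abstract can be illustrated with a minimal Python sketch. This is not LEVIATHAN's implementation (the tool itself is written in C++). The function names, the fixed-size windowing of the genome into regions, the `(chromosome, position, barcode)` input tuples, and the `min_shared` threshold are all assumptions made for illustration only. The sketch maps each barcode to the regions it covers, counts how many barcodes each pair of regions has in common, and keeps the pairs above a threshold as candidates.

```python
from collections import defaultdict
from itertools import combinations


def build_barcode_index(alignments, region_size):
    """Map each barcode to the set of (chromosome, window) regions it hits.

    `alignments` is an iterable of (chromosome, position, barcode) tuples;
    this input shape and the fixed-size windows are illustrative assumptions.
    """
    index = defaultdict(set)
    for chrom, pos, barcode in alignments:
        index[barcode].add((chrom, pos // region_size))
    return index


def shared_barcode_counts(index):
    """Count, for every pair of regions, how many barcodes they share."""
    counts = defaultdict(int)
    for regions in index.values():
        # Each barcode contributes 1 to every pair of regions it covers.
        for a, b in combinations(sorted(regions), 2):
            counts[(a, b)] += 1
    return counts


def candidate_pairs(counts, min_shared):
    """Keep region pairs sharing at least `min_shared` barcodes."""
    return {pair: n for pair, n in counts.items() if n >= min_shared}


# Toy data: two barcodes link the start of chr1 to a distant chr1 window,
# and a single barcode links chr1 to chr2.
alignments = [
    ("chr1", 1200, "BX1"), ("chr1", 95000, "BX1"),
    ("chr1", 1800, "BX2"), ("chr1", 96500, "BX2"),
    ("chr1", 1500, "BX3"), ("chr2", 40000, "BX3"),
]
index = build_barcode_index(alignments, region_size=10000)
counts = shared_barcode_counts(index)
candidates = candidate_pairs(counts, min_shared=2)
print(candidates)
```

Iterating over barcodes rather than over all region pairs means that only pairs sharing at least one barcode are ever touched. Note that neighbouring regions share barcodes simply because they lie on the same DNA fragment, so a practical caller would also need to account for the distance between regions; the sketch above does not.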
Document type : Conference papers

https://hal.inria.fr/hal-03441874
Contributor : Claire Lemaitre
Submitted on : Monday, November 22, 2021 - 8:31:10 PM
Last modification on : Friday, August 5, 2022 - 2:54:52 PM
Long-term archiving on : Wednesday, February 23, 2022 - 8:43:23 PM

File

JOBIM2021_paper_21.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03441874, version 1

Citation

Pierre Morisse, Fabrice Legeai, Claire Lemaitre. LEVIATHAN: efficient discovery of large structural variants by leveraging long-range information from Linked-Reads data. JOBIM 2021 - Journées Ouvertes en Biologie, Informatique et Mathématiques, Jul 2021, Paris, France. pp.1-8. ⟨hal-03441874⟩

Metrics

Record views : 33
Files downloads : 74