Structural variant genotyping with long read data

Lolita Lecompte 1, 2
2 GenScale - Scalable, Optimized and Parallel Algorithms for Genomics
Inria Rennes – Bretagne Atlantique , IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
Abstract : Structural Variants (SVs) are genomic rearrangements of more than 50 base pairs. Because SVs can span several thousand base pairs, they can have a major impact on genome function, and studying them is therefore of great interest. Recently, a new generation of sequencing technologies has been developed that produces long reads of tens of thousands of base pairs, which are particularly useful for spanning SV breakpoints. So far, bioinformatics methods for long read data have focused on the SV discovery problem. However, no method has been proposed to specifically address the genotyping of SVs with long read data. The purpose of SV genotyping is to assess, for each variant of a given input set, which alleles are present in a newly sequenced sample. This thesis proposes a new method for genotyping SVs with long read data, based on the representation of each allele sequence. We also define a set of conditions for considering a read as supporting an allele. Our method has been implemented in a tool called SVJedi. The tool has been validated on both simulated and real human data and achieves high genotyping accuracy. We show that SVJedi performs better than other existing long read genotyping tools, and we also demonstrate that SV genotyping is considerably improved with SVJedi compared to other approaches, namely SV discovery and short read SV genotyping approaches.
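To make the genotyping task described in the abstract concrete, the following is a minimal sketch of how a genotype could be estimated once reads have been assigned to alleles. It is an illustration under stated assumptions, not the method implemented in SVJedi: the function name `genotype_from_counts`, the simple binomial likelihood model, and the `error_rate` and `min_reads` values are all hypothetical choices made for this example.

```python
from math import log


def genotype_from_counts(ref_reads, alt_reads, error_rate=0.05, min_reads=3):
    """Pick the most likely genotype from allele-supporting read counts.

    Hypothetical sketch: a plain binomial model, not SVJedi's actual code.
    ref_reads / alt_reads are the numbers of reads judged to support the
    reference allele and the alternative (SV) allele, respectively.
    """
    total = ref_reads + alt_reads
    if total < min_reads:
        # Too few informative reads: leave the genotype undetermined.
        return "./."

    # Assumed probability that a read supports the alternative allele
    # under each genotype (error_rate models misassigned reads).
    p_alt = {
        "0/0": error_rate,
        "0/1": 0.5,
        "1/1": 1.0 - error_rate,
    }

    def log_likelihood(p):
        # The binomial coefficient is identical for all genotypes,
        # so it can be dropped when only comparing them.
        return alt_reads * log(p) + ref_reads * log(1.0 - p)

    # Return the genotype with the highest log-likelihood.
    return max(p_alt, key=lambda g: log_likelihood(p_alt[g]))


if __name__ == "__main__":
    for counts in [(10, 0), (6, 5), (0, 12), (1, 1)]:
        print(counts, genotype_from_counts(*counts))
```

In this sketch, a variant with reads supporting only one allele is called homozygous, a balanced mix of reads is called heterozygous, and a variant covered by too few reads is left uncalled. How reads are judged to support an allele is the part the thesis addresses and is not modelled here.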
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-03082460
Contributor : Lolita Lecompte
Submitted on : Friday, December 18, 2020 - 3:53:19 PM
Last modification on : Friday, April 9, 2021 - 3:13:24 AM

File

Lolita_Lecompte.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-03082460, version 1

Citation

Lolita Lecompte. Structural variant genotyping with long read data. Computer Science [cs]. Université de Rennes 1 (UR1), 2020. English. ⟨tel-03082460v1⟩
