QuickDeconvolution: fast and scalable deconvolution of linked-reads sequencing data
Master Thesis, Year: 2021

Abstract

Linked-read technologies, such as the 10x Chromium system, use microfluidics to tag multiple short reads coming from the same long (50-200 kbp) fragment with a short sequence called a barcode. They are cheap and easy to prepare. The fact that reads with the same barcode come from the same fragment of the genome is extremely rich in information and can be exploited by many software tools. However, the same barcode may be reused for several different fragments, which complicates the analyses. Here we present QuickDeconvolution (QD), a new software tool capable of deconvolving a set of reads sharing a barcode, i.e. separating the reads that come from different fragments. QD takes only the sequencing data as input, without the need for a reference genome. We show on made-up examples that QuickDeconvolution is more precise and faster than existing software, especially with many threads. More importantly, it is more scalable and therefore capable of deconvolving datasets that were inaccessible to previous software. We demonstrate here the first example in the literature of a successfully deconvolved animal genome, a Drosophila melanogaster dataset of 33 Gbp.
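To make the deconvolution problem concrete, here is a minimal toy sketch in Python. It is not QuickDeconvolution's algorithm, which the abstract does not describe. It assumes, purely for illustration, that reads from the same fragment overlap directly, and it groups reads sharing one barcode by connected components over shared k-mers. The function names `kmers` and `deconvolve_barcode` and the parameter `k` are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def deconvolve_barcode(reads, k=5):
    """Toy deconvolution of the reads sharing ONE barcode.

    Assumption (illustrative only): two reads belong to the same fragment
    if they share at least one k-mer, directly or through a chain of other
    reads. Returns sorted lists of read indices, one list per inferred
    fragment.
    """
    parent = list(range(len(reads)))  # union-find forest over read indices

    def find(i):
        # Find the representative of i, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Index reads by the k-mers they contain.
    reads_by_kmer = defaultdict(list)
    for idx, seq in enumerate(reads):
        for kmer in kmers(seq, k):
            reads_by_kmer[kmer].append(idx)

    # Merge all reads that contain the same k-mer.
    for indices in reads_by_kmer.values():
        for other in indices[1:]:
            root_a, root_b = find(indices[0]), find(other)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the connected components.
    groups = defaultdict(list)
    for idx in range(len(reads)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


# Four reads carrying the same barcode, drawn from two unrelated fragments.
reads = [
    "ATGCGTACCT",   # fragment A
    "GGGTTTCCCA",   # fragment B
    "TACCTGAAGTC",  # fragment A, overlaps read 0
    "TCCCAAAGGTT",  # fragment B, overlaps read 1
]
print(deconvolve_barcode(reads))  # [[0, 2], [1, 3]]
```

In real linked-read data, the short reads from one fragment rarely overlap each other, so a practical tool needs additional evidence beyond this direct-overlap heuristic. The sketch only illustrates the input (reads sharing a barcode) and the output (one group of reads per fragment).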
Main file
rapport_de_stage_QuickDeconvolution.pdf (491.5 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03479233, version 1 (14-12-2021)

Identifiers

  • HAL Id: hal-03479233, version 1

Cite

Roland Faure. QuickDeconvolution: fast and scalable deconvolution of linked-reads sequencing data. Bioinformatics [q-bio.QM]. 2021. ⟨hal-03479233⟩