Detecting Mutations by eBWT

Abstract: In this paper we develop a theory describing how the extended Burrows-Wheeler Transform (eBWT) of a collection of DNA fragments tends to cluster together the copies of nucleotides sequenced from a genome G. Our theory accurately predicts how many copies of any nucleotide are expected inside each such cluster, and shows how a precise procedure based on the LCP array can locate these clusters in the eBWT. Our findings are general and can be applied to a wide range of problems. Here we consider alignment-free and reference-free SNP discovery in multiple collections of reads. We note that, in accordance with our theoretical results, SNPs are clustered in the eBWT of the read collection, and we develop a tool that finds SNPs with a simple scan of the eBWT and LCP arrays. Preliminary results show that our method requires much less coverage than state-of-the-art tools while drastically improving precision and sensitivity.
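
The clustering idea in the abstract can be illustrated with a small Python sketch. This is not the authors' tool. It assumes a naive construction that sorts every suffix of every read (each read terminated by a `$` end marker), and it approximates clusters as maximal runs of suffixes sharing at least `k` characters, which is a simplification of the LCP-based procedure the abstract refers to. The names `suffix_table`, `lcp`, `candidate_clusters` and the parameters `k` and `min_count` are hypothetical.

```python
from collections import Counter


def suffix_table(reads):
    """Return (suffix, preceding_char) pairs for all reads, sorted by suffix.

    Assumption: each read is terminated by '$', and equal suffixes are
    ordered by read index. The preceding character is the BWT symbol.
    """
    rows = []
    for idx, read in enumerate(reads):
        text = read + "$"
        for i in range(len(text)):
            # text[i - 1] wraps around to '$' when i == 0.
            rows.append((text[i:], idx, text[i - 1]))
    rows.sort(key=lambda r: (r[0], r[1]))
    return [(suffix, prev) for suffix, _, prev in rows]


def lcp(a, b):
    """Length of the common prefix of a and b, not counting end markers."""
    n = 0
    for x, y in zip(a, b):
        if x != y or x == "$":
            break
        n += 1
    return n


def candidate_clusters(reads, k, min_count=2):
    """Report runs of suffixes sharing >= k characters whose preceding
    symbols contain at least two nucleotides seen min_count times each."""
    rows = suffix_table(reads)
    found = []
    start = 0
    for i in range(1, len(rows) + 1):
        # A run ends where the shared context drops below k characters.
        if i == len(rows) or lcp(rows[i - 1][0], rows[i][0]) < k:
            counts = Counter(prev for _, prev in rows[start:i] if prev != "$")
            frequent = {c: n for c, n in counts.items() if n >= min_count}
            if len(frequent) >= 2:
                found.append((rows[start][0][:k], frequent))
            start = i
    return found


# Toy input: two reads per allele, differing at the second base.
reads = ["GATTACA", "GATTACA", "GCTTACA", "GCTTACA"]
print(candidate_clusters(reads, k=4))
```

On this toy input the sketch prints `[('TTAC', {'A': 2, 'C': 2})]`: the suffixes starting with the shared context `TTAC` are adjacent in sorted order, and the symbols preceding them split between `A` and `C`, which is the signature of a candidate SNP. The quadratic suffix sort is only suitable for tiny examples.
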
Document type:
Conference paper
Workshop on Algorithms in Bioinformatics, 2018, Helsinki, Finland. 〈10.4230/LIPIcs.CVIT.2016.23〉

https://hal.inria.fr/hal-01925950
Contributor: Marie-France Sagot
Submitted on: Sunday, November 18, 2018 - 16:16:25
Last modified on: Tuesday, November 20, 2018 - 01:17:53
Document(s) archived on: Tuesday, February 19, 2019 - 12:56:24

File

1805.01876.pdf
Files produced by the author(s)

Citation

Nicola Prezza, Nadia Pisanti, Marinella Sciortino, Giovanna Rosone. Detecting Mutations by eBWT. Workshop on Algorithms in Bioinformatics, 2018, Helsinki, Finland. 〈10.4230/LIPIcs.CVIT.2016.23〉. 〈hal-01925950〉
