Context

Contributions

Processing pipeline
6.2.1 Overview

Overlap computation

Computation of overlap piles and windows

Computing the consensus of a window
6.2.5.2 Generalization of the segmentation strategy
6.2.5.3 Refinement of the consensus with a de Bruijn graph

Output

Results
6.3.4 Comparison with the state of the art on simulated data

Summary

Context. Third-generation sequencing technologies have evolved considerably since their introduction in 2011. In particular, read error rates, which reached 15 to 30% in the first sequencing experiments, have been greatly [...]

7.1 Introduction

3.1.2 Sequencing depth of 60x

Real data

Conclusion

General perspectives

This chapter presents the general conclusion of this thesis, as well as its perspectives, proposed both at the scale of the various results described [...]

P. Morisse, T. Lecroq, and A. Lefebvre, HG-CoLoR: enHanced de Bruijn Graph for the error Correction of Long Reads, Journées Ouvertes en Biologie, Informatique et Mathématiques (JOBIM), 2017.

P. Morisse, C. Marchet, A. Limasset, T. Lecroq, and A. Lefebvre, CONSENT: Scalable self-correction of long reads with multiple sequence alignment, Journées Ouvertes en Biologie, Informatique et Mathématiques (JOBIM), 2019.

Communications at international workshops

P. Morisse, T. Lecroq, and A. Lefebvre, Enhanced de Bruijn Graphs, Mathematical Foundations in Bioinformatics (MatBio), 2017.

P. Morisse, T. Lecroq, and A. Lefebvre, Hybrid correction of long reads using a variable-order de Bruijn graph, Data Structures in Bioinformatics (DSB), 2018.

List of publications and communications

P. Morisse, C. Marchet, A. Limasset, T. Lecroq, and A. Lefebvre, CONSENT: Scalable self-correction of long reads with multiple sequence alignment, Data Structures in Bioinformatics (DSB), 2019.

Communications at national workshops

C. Marchet, P. Morisse, L. Lecompte, A. Limasset, and A. Lefebvre, ELECTOR: EvaLuator of Error Correction Tools for lOng Reads, SeqBio, 2018.

P. Morisse, T. Lecroq, and A. Lefebvre, HG-CoLoR: enHanced de Bruijn Graph for the error Correction of Long Reads, 2017.

P. Morisse, A. Limasset, C. Marchet, A. Lefebvre, and P. Peterlongo, , 2018.

Invited talks and seminars

P. Morisse, Correction de données de séquençage de troisième génération, Séminaire d'informatique théorique, 2019.

P. Morisse, Diverses approches pour l'auto-correction des lectures longues, Séminaire Symbiose de l'Inria, 2017.

C. Marchet, P. Morisse, L. Lecompte, A. Limasset, and A. Lefebvre, ELECTOR: EvaLuation of Error Correction Tools for lOng Reads, Journées Ouvertes en Biologie, Informatique et Mathématiques (JOBIM), 2018.

Preprints

C. Marchet, P. Morisse, L. Lecompte, A. Limasset, and A. Lefebvre, ELECTOR: Evaluator for long reads correction methods, bioRxiv, 2019.

P. Morisse, C. Marchet, A. Limasset, T. Lecroq, and A. Lefebvre, CONSENT: Scalable self-correction of long reads with multiple sequence alignment, 2019.

A. Allam, P. Kalnis, and V. Solovyev, Karect: accurate correction of substitution, insertion and deletion errors for next-generation sequencing data, Bioinformatics, vol.31, pp.3421-3428, 2015.

S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, Basic local alignment search tool, Journal of Molecular Biology, vol.215, pp.403-410, 1990.

E. L. Anson and E. W. Myers, ReAligner: A Program for Refining DNA Sequence Multi-Alignments, Journal of Computational Biology, vol.4, pp.369-383, 1997.

K. F. Au, J. G. Underwood, L. Lee, and W. H. Wong, Improving PacBio Long Read Accuracy by Short Read Alignment, PLoS ONE, vol.7, issue.10, pp.1-8, 2012.

A. Bankevich, S. Nurk, D. Antipov, A. A. Gurevich et al., SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing, Journal of Computational Biology, vol.19, pp.455-477, 2012.

E. Bao and L. Lan, HALC: High throughput algorithm for long read error correction, BMC Bioinformatics, vol.18, p.204, 2017.

E. Bao, F. Xie, C. Song, and D. Song, FLAS: fast and high-throughput algorithm for PacBio long-read self-correction, Bioinformatics, 2019.

L. E. Baum, An Inequality and Associated Maximization Technique in Statistical Estimation for Probabilistic Functions of Markov Processes, Inequalities III: Proceedings of the Third Symposium on Inequalities, pp.1-8, 1972.

K. Berlin, S. Koren, C.-S. Chin, J. P. Drake et al., Assembling large genomes with single-molecule sequencing and locality-sensitive hashing, Nature Biotechnology, vol.33, pp.623-630, 2015.

C. Boucher, A. Bowe, T. Gagie, S. J. Puglisi, and K. Sadakane, Variable-Order De Bruijn Graphs, Proceedings of the 2015 Data Compression Conference, pp.383-392, 2015.

A. Bowe, T. Onodera, K. Sadakane, and T. Shibuya, Succinct de Bruijn Graphs, Algorithms in Bioinformatics: 12th International Workshop, WABI 2012, pp.225-235, 2012.

Bibliography

C. Bron and J. Kerbosch, Algorithm 457: Finding All Cliques of an Undirected Graph, Communications of the ACM, vol.16, pp.575-577, 1973.

N. G. de Bruijn, A combinatorial problem, Proceedings of the Section of Sciences of the Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam, vol.49, issue.7, pp.758-764, 1946.

M. Burrows and D. J. Wheeler, A block-sorting lossless data compression algorithm, 1994.

S. Caboche, C. Audebert, Y. Lemoine, and D. Hot, Comparison of mapping algorithms used in high-throughput sequencing: Application to Ion Torrent data, BMC Genomics, vol.15, pp.1-16, 2014.

M. J. Chaisson and G. Tesler, Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory, BMC Bioinformatics, vol.13, p.238, 2012.

R. Chikhi and G. Rizk, Space-efficient and exact de Bruijn graph representation based on a Bloom filter, Algorithms for Molecular Biology, vol.2, pp.1-9, 2013.

C.-S. Chin, P. Peluso, F. J. Sedlazeck, M. Nattestad et al., Phased diploid genome assembly with single-molecule real-time sequencing, Nature Methods, vol.13, pp.1050-1054, 2016.

C.-S. Chin, D. H. Alexander, P. Marks, A. A. Klammer et al., Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data, Nature Methods, vol.10, pp.563-569, 2013.

O. Choudhury, A. Chakrabarty, and S. J. Emrich, HECIL: A hybrid error correction algorithm for long reads with iterative learning, Scientific Reports, vol.8, issue.1, pp.1-9, 2018.

P. J. A. Cock, C. J. Fields, N. Goto, M. L. Heuer, and P. M. Rice, The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants, Nucleic Acids Research, vol.38, pp.1767-1771, 2009.

M. David, M. Dzamba, D. Lister, L. Ilie, and M. Brudno, SHRiMP2: Sensitive yet practical short read mapping, Bioinformatics, vol.27, pp.1011-1012, 2011.

E. W. Dijkstra, A Note on Two Problems in Connexion with Graphs, Numerische Mathematik, vol.1, issue.1, pp.269-271, 1959.

D. Eppstein, M. Löffler, and D. Strash, Listing All Maximal Cliques in Sparse Graphs in Near-Optimal Time, Algorithms and Computation, pp.403-414, 2010.

D. Eppstein and D. Strash, Listing All Maximal Cliques in Large Sparse Real-World Graphs, Experimental Algorithms, pp.364-375, 2011.

B. Ewing, L. Hillier, M. C. Wendl, and P. Green, Base-Calling of Automated Sequencer Traces Using Phred. I. Accuracy Assessment, Genome Research, vol.8, pp.175-185, 1998.

B. Ewing and P. Green, Base-Calling of Automated Sequencer Traces Using Phred. II. Error Probabilities, Genome Research, vol.8, pp.186-194, 1998.

P. Ferragina and G. Manzini, Opportunistic data structures with applications, Proceedings of the 41st Annual Symposium on Foundations of Computer Science FOCS '00, pp.390-398, 2000.

C. Firtina, Z. Bar-Joseph, C. Alkan, and A. E. Cicek, Hercules: a profile HMM-based hybrid error correction algorithm for long reads, Nucleic Acids Research, vol.46, 2018.

R. W. Floyd, Algorithm 97: Shortest Path, Communications of the ACM, vol.5, p.345, 1962.

M. Frazier, R. Gibbs, D. Muzny, S. E. Scherer, and J. Bouck, Initial sequencing and analysis of the human genome, Nature, vol.409, pp.860-921, 2001.

I. Good, Normal Recurring Decimals, Journal of the London Mathematical Society, pp.167-169, 1946.

S. Goodwin, J. Gurtowski, S. Ethe-Sayers, P. Deshpande, M. C. Schatz et al., Oxford Nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome, Genome Research, vol.25, pp.1750-1756, 2015.

S. Goodwin, J. D. McPherson, and W. R. McCombie, Coming of age: Ten years of next-generation sequencing technologies, Nature Reviews Genetics, vol.17, pp.333-351, 2016.

M. Miyamoto, D. Motooka, K. Gotoh, T. Imai, K. Yoshitake et al., Performance comparison of second- and third-generation sequencers using a bacterial genome with two chromosomes, BMC Genomics, vol.15, p.699, 2014.

C. Grasso and C. Lee, Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems, Bioinformatics, vol.20, pp.1546-1556, 2004.

M. Gužvić, The history of DNA sequencing, Journal of Medical Biochemistry, vol.32, pp.301-312, 2013.

T. Hackl, R. Hedrich, J. Schultz, and F. Förster, proovread: large-scale high-accuracy PacBio correction through iterative short read consensus, Bioinformatics, vol.30, pp.3004-3011, 2014.

E. Haghshenas, F. Hach, S. C. Sahinalp, and C. Chauve, CoLoRMap: Correcting Long Reads by Mapping short reads, Bioinformatics, vol.32, pp.545-551, 2016.

Y. Heo, Improving quality of high-throughput sequencing reads, 2015.

Y. Heo, X.-L. Wu, D. Chen, J. Ma, and W.-M. Hwu, BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads, Bioinformatics, vol.30, pp.1354-1362, 2014.

R. Hu, G. Sun, and X. Sun, LSCplus: a fast solution for improving long read accuracy by short read alignment, BMC Bioinformatics, vol.17, p.451, 2016.

W. Huang, L. Li, J. R. Myers, and G. T. Marth, ART: a next-generation sequencing read simulator, Bioinformatics, vol.28, pp.593-594, 2012.

L. Ilie, F. Fazayeli, and S. Ilie, HiTEC: Accurate error correction in high-throughput sequencing data, Bioinformatics, vol.27, pp.295-302, 2011.

S. D. Jackman, B. P. Vandervalk, H. Mohamadi et al., ABySS 2.0: Resource-Efficient Assembly of Large Genomes using a Bloom Filter, Genome Research, vol.27, pp.768-777, 2017.

M. Jain, S. Koren, K. H. Miga, J. Quick, A. C. Rand et al., Nanopore sequencing and assembly of a human genome with ultra-long reads, Nature Biotechnology, vol.36, p.338, 2018.

J. T. Simpson, K. Wong, S. D. Jackman, J. E. Schein, S. J. M. Jones, and I. Birol, ABySS: A parallel assembler for short read sequence data, Genome Research, vol.19, pp.1117-1123, 2009.

W.-C. Kao, A. H. Chan, and Y. S. Song, ECHO: A reference-free short-read error correction algorithm, Genome Research, vol.21, pp.1181-1192, 2011.

M. Kchouk and M. Elloumi, An Error Correction and De-Novo Assembly Approach for Nanopore Reads Using Short Reads, Current Bioinformatics, vol.13, pp.241-252, 2018.

M. Kchouk, J.-F. Gibrat, and M. Elloumi, Generations of Sequencing Technologies: From First to Next Generation, 2017.

D. R. Kelley, M. C. Schatz, and S. L. Salzberg, Quake: quality-aware detection and correction of sequencing errors, Genome Biology, vol.11, issue.11, 2010.

W. J. Kent, BLAT: The BLAST-Like Alignment Tool, Genome Research, vol.12, pp.656-664, 2002.

S. M. Kiełbasa, R. Wan, K. Sato, P. Horton, and M. C. Frith, Adaptive seeds tame genomic sequence comparison, Genome Research, vol.21, pp.487-493, 2011.

M. Kokot, M. Długosz, and S. Deorowicz, KMC 3: counting and manipulating k-mer statistics, Bioinformatics, vol.33, pp.2759-2761, 2017.

S. Koren, G. P. Harhay, T. P. L. Smith, J. L. Bono et al., Reducing assembly complexity of microbial genomes with single-molecule sequencing, Genome Biology, vol.14, issue.9, 2013.

S. Koren, M. C. Schatz, B. P. Walenz et al., Hybrid error correction and de novo assembly of single-molecule sequencing reads, Nature Biotechnology, vol.30, pp.693-700, 2012.

S. Koren, B. P. Walenz, K. Berlin, J. R. Miller et al., Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation, Genome Research, vol.27, pp.722-736, 2017.

T. Kowalski, S. Grabowski, and S. Deorowicz, Indexing Arbitrary-Length k-Mers in Sequencing Reads, PLoS ONE, vol.10, pp.1-16, 2015.

S. Kurtz, Reducing the space requirement of suffix trees, Software: Practice and Experience, vol.29, issue.13, pp.1149-1171, 1999.

S. Kurtz, A. Phillippy, A. L. Delcher, M. Smoot, M. Shumway et al., Versatile and open software for comparing large genomes, Genome Biology, vol.5, 2004.

S. La, E. Haghshenas, and C. Chauve, LRCstats, a tool for evaluating long reads correction methods, Bioinformatics, vol.33, pp.3652-3654, 2017.

B. Langmead, C. Trapnell, M. Pop, and S. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome, Genome Biology, vol.10, 2009.

B. Langmead and S. L. Salzberg, Fast gapped-read alignment with Bowtie 2, Nature Methods, vol.9, pp.357-359, 2012.

C. Lee, Generating consensus sequences from partial order multiple sequence alignment graphs, Bioinformatics, vol.19, pp.999-1008, 2003.

C. Lee, C. Grasso, and M. F. Sharlow, Multiple sequence alignment using partial order graphs, Bioinformatics, vol.18, pp.452-464, 2002.

H. Lee, J. Gurtowski, S. Yoo, S. Marcus, W. R. McCombie et al., Error correction and assembly complexity of single molecule sequencing reads, bioRxiv, p.6395, 2014.

H. Li, Aligning sequence reads, clone sequences and assembly contigs with, 2013.

H. Li, Minimap and miniasm: Fast mapping and de novo assembly for noisy long sequences, Bioinformatics, vol.32, pp.2103-2110, 2016.

H. Li, Minimap2: pairwise alignment for nucleotide sequences, Bioinformatics, vol.34, pp.3094-3100, 2018.

H. Li and R. Durbin, Fast and accurate long-read alignment with Burrows-Wheeler transform, Bioinformatics, vol.26, pp.589-595, 2010.

H. Li and R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler transform, Bioinformatics, vol.25, pp.1754-1760, 2009.

R. Li, H. Zhu, J. Ruan, W. Qian, X. Fang et al., De novo assembly of human genomes with massively parallel short read sequencing, Genome Research, vol.20, pp.265-272, 2010.

Y. Li, R. Han, C. Bi, M. Li, S. Wang et al., DeepSimulator: A deep simulator for Nanopore sequencing, Bioinformatics, vol.34, pp.2899-2908, 2018.

L. Lima, C. Marchet, S. Caboche, C. Da Silva et al., Comparative assessment of long-read error-correction software applied to RNA-sequencing data, 2019.

Y. Lin and P. A. Pevzner, Manifold de Bruijn Graphs, Algorithms in Bioinformatics: 14th International Workshop, WABI 2014, pp.296-310, 2014.

Y. Liu, J. Schröder, and B. Schmidt, Musket: A multistage k-mer spectrum-based error corrector for Illumina sequence data, Bioinformatics, vol.29, pp.308-315, 2013.

C. Camacho, G. Coulouris, V. Avagyan, N. Ma et al., BLAST+: architecture and applications, BMC Bioinformatics, vol.10, p.421, 2009.

M.-A. Madoui, S. Engelen, C. Cruaud, C. Belser, L. Bertrand et al., Genome assembly using Nanopore-guided long and error-free DNA reads, BMC Genomics, vol.16, p.327, 2015.

N. Maillet, G. Collet, T. Vannier, D. Lavenier, and P. Peterlongo, Commet: Comparing and combining multiple metagenomic datasets, IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2014.

U. Manber and G. Myers, Suffix Arrays: A New Method for On-line String Searches, Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms SODA '90, pp.319-327, 1990.

G. Marçais and C. Kingsford, Jellyfish: A fast k-mer counter, pp.1-8, 2012.

G. Marçais, J. A. Yorke, and A. Zimin, QuorUM: An Error Corrector for Illumina Reads, PLoS ONE, vol.10, pp.1-13, 2015.

C. Marchet, From reads to transcripts: de novo methods for the analysis of transcriptome second and third generation sequencing, 2018.

P. Marijon, R. Chikhi, and J.-S. Varré, yacrd and fpa: upstream tools for long-read genome assembly, 2019.

A. M. Maxam and W. Gilbert, A new method for sequencing DNA, Proceedings of the National Academy of Sciences of the United States of America, vol.74, pp.560-564, 1977.

G. Miclotte, M. Heydari, P. Demeester, S. Rombauts, Y. Van de Peer et al., Jabba: hybrid error correction for long sequencing reads, Algorithms for Molecular Biology, vol.11, p.10, 2016.

A. Mikheenko, A. Prjibelski, V. Saveliev, D. Antipov, and A. Gurevich, Versatile genome assembly evaluation with QUAST-LG, Bioinformatics, vol.34, pp.142-150, 2018.

J. C. Mu, H. Jiang, A. Kiani, M. Mohiyuddin, N. B. Asadi et al., Fast and accurate read alignment for resequencing, Bioinformatics, vol.28, pp.2366-2373, 2012.

E. W. Myers, G. G. Sutton, A. L. Delcher, I. M. Dew, D. P. Fasulo et al., A whole-genome assembly of Drosophila, Science, vol.287, pp.2196-2204, 2000.

G. Myers, Efficient Local Alignment Discovery amongst Noisy Long Reads, Algorithms in Bioinformatics, pp.52-67, 2014.

S. B. Needleman and C. D. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins, Journal of Molecular Biology, vol.48, pp.443-453, 1970.

M. Pop, A. Phillippy, A. L. Delcher, and S. L. Salzberg, Comparative genome assembly, Briefings in Bioinformatics, vol.5, pp.237-248, 2004.

J. A. Reuter, D. V. Spacek, and M. P. Snyder, High-throughput sequencing technologies, Molecular Cell, vol.58, pp.586-597, 2015.

D. C. Richter, F. Ott, A. F. Auch, R. Schmid, and D. H. Huson, MetaSim: A Sequencing Simulator for Genomics and Metagenomics, Handbook of Molecular Microbial Ecology I: Metagenomics and Complementary Approaches 3, vol.10, pp.417-421, 2011.

G. Rizk, D. Lavenier, and R. Chikhi, DSK: K-mer counting with very low memory usage, Bioinformatics, vol.29, pp.652-653, 2013.

C. Flye Sainte-Marie, Question 48, L'Intermédiaire des Mathématiciens, vol.1, pp.107-110, 1894.

L. Salmela, Correction of sequencing errors in a mixed set of reads, Bioinformatics, vol.26, 2010.

L. Salmela and E. Rivals, LoRDEC: accurate and efficient long read error correction, Bioinformatics, vol.30, pp.3506-3514, 2014.
URL : https://hal.archives-ouvertes.fr/lirmm-01100451

L. Salmela and J. Schröder, Correcting errors in short reads by multiple alignments, Bioinformatics, vol.27, pp.1455-1461, 2011.

L. Salmela, R. Walve, E. Rivals, and E. Ukkonen, Accurate self-correction of errors in long reads using de Bruijn graphs, Bioinformatics, vol.33, pp.799-806, 2017.

F. Sanger, S. Nicklen, and A. Coulson, DNA sequencing with chain-terminating inhibitors, Proceedings of the National Academy of Sciences of the United States of America, vol.74, pp.5463-5467, 1977.

C. Schensted, Longest Increasing and Decreasing Subsequences, Canadian Journal of Mathematics, vol.13, pp.179-191, 1961.

J. Schröder, H. Schröder, S. J. Puglisi, R. Sinha, and B. Schmidt, SHREC: A short-read error correction method, Bioinformatics, vol.25, pp.2157-2163, 2009.

F. J. Sedlazeck, H. Lee, C. A. Darby, and M. C. Schatz, Piercing the dark matter: bioinformatics of long-range sequencing and mapping, Nature Reviews Genetics, vol.19, pp.329-346, 2018.

F. J. Sedlazeck, P. Rescheneder, M. Smolka, H. Fang et al., Accurate detection of complex structural variations using single-molecule sequencing, Nature Methods, vol.15, pp.461-468, 2018.

J. Shendure, S. Balasubramanian, G. M. Church, W. Gilbert et al., DNA sequencing at 40: Past, present and future, Nature, vol.550, pp.345-353, 2017.

J. Shendure and H. Ji, Next-generation DNA sequencing, Nature Biotechnology, vol.26, pp.1135-1145, 2008.

T. F. Smith and M. S. Waterman, Identification of common molecular subsequences, Journal of Molecular Biology, vol.147, issue.1, pp.195-197, 1981.

B. K. Stöcker, J. Köster, and S. Rahmann, SimLoRD: Simulation of Long Read Data, Bioinformatics, vol.32, pp.2704-2706, 2016.

G. Tischler and E. W. Myers, Non Hybrid Long Read Consensus Using Local De Bruijn Graph Assembly, 2017.

R. Vaser, I. Sović, N. Nagarajan, and M. Šikić, Fast and accurate de novo genome assembly from long uncorrected reads, Genome Research, vol.27, pp.737-746, 2017.

A. Viterbi, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Transactions on Information Theory, vol.13, pp.260-269, 1967.

G. A. Volkmer, G. P. Irzyk, X. V. Gomes, V. B. Makhijani, and G. T. Roth, Genome sequencing in microfabricated high-density picolitre reactors, Nature, vol.437, pp.376-380, 2005.

M. Vyverman, B. De Baets, V. Fack, and P. Dawyndt, essaMEM: Finding maximal exact matches using enhanced sparse suffix arrays, Bioinformatics, vol.29, pp.802-804, 2013.

J. R. Wang, J. Holt, L. McMillan, and C. D. Jones, FMLRC: Hybrid long read error correction using an FM-index, BMC Bioinformatics, vol.19, pp.1-11, 2018.

S. Warshall, A Theorem on Boolean Matrices, Journal of the ACM, vol.9, pp.11-12, 1962.

J. D. Watson and F. H. C. Crick, Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid, Nature, vol.171, pp.737-738, 1953.

Z.-G. Wei and S.-W. Zhang, NPBSS: A new PacBio sequencing simulator for generating the continuous long reads with an empirical model, BMC Bioinformatics, vol.19, pp.1-9, 2018.

P. Weiner, Linear pattern matching algorithms, IEEE Conference Record of 14th Annual Symposium on Switching and Automata Theory (SWAT), pp.1-11, 1973.

. Wikimedia,

H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan et al., The Sequence Alignment/Map format and SAMtools, Bioinformatics, vol.25, pp.2078-2079, 2009.

C.-L. Xiao, Y. Chen, S.-Q. Xie, K.-N. Chen et al., Fast mapping, error correction, and de novo assembly for single-molecule sequencing reads, Nature Methods, vol.14, pp.1072-1074, 2017.

C. Yang, J. Chu, R. L. Warren, and I. Birol, NanoSim: Nanopore sequence read simulator based on statistical characterization, 2017.

X. Yang, S. P. Chockalingam, and S. Aluru, A survey of error-correction methods for next-generation sequencing, Briefings in Bioinformatics, vol.14, pp.56-66, 2013.

C. Ye and Z. S. Ma, Sparc: a sparsity-based consensus algorithm for long erroneous sequencing reads, PeerJ, vol.4, 2016.