G. Achaz, F. Boyer, E. Rocha, A. Viari, and E. Coissac, Repseek, a tool to retrieve approximate repeats from large DNA sequences, Bioinformatics, vol.23, issue.1, pp.119-121, 2007.
DOI : 10.1093/bioinformatics/btl519

A. Apostolico, M. Comin, and L. Parida, VARUN: Discovering Extensible Motifs under Saturation Constraints, IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.7, issue.4, pp.752-762, 2010.
DOI : 10.1109/TCBB.2008.123

M. Brudno, C. B. Do, G. M. Cooper, M. F. Kim, E. Davydov et al., LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNA, Genome Research, vol.13, issue.4, pp.721-731, 2003.
DOI : 10.1101/gr.926603

G. Battaglia, R. Grossi, N. Pisanti, R. Marangoni, and G. Menconi, Inferring mobile elements in S. cerevisiae strains, Proceedings of International Conference on Bioinformatics Models, pp.131-136, 2011.

E. Birmelé, P. Crescenzi, R. A. Ferreira, R. Grossi, V. Lacroix et al., Efficient Bubble Enumeration in Directed Graphs, Proceedings of 19th Symposium on String Processing and Information Retrieval, pp.118-129, 2012.
DOI : 10.1007/978-3-642-34109-0_13

C. Bron and J. Kerbosch, Algorithm 457: finding all cliques of an undirected graph, Communications of the ACM, vol.16, issue.9, pp.575-577, 1973.
DOI : 10.1145/362342.362367

S. Burkhardt, A. Crauser, P. Ferragina, H. Lenhof, E. Rivals, and M. Vingron, q-gram based database searching using a suffix array (QUASAR), ACM Conference on Research in Computational Molecular Biology, pp.77-83, 1999.
DOI : 10.1145/299432.299460

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

R. C. Edgar, MUSCLE: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Research, vol.32, issue.5, pp.1792-1797, 2004.
DOI : 10.1093/nar/gkh340

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC390337

M. Federico, P. Peterlongo, and N. Pisanti, An optimized filter for finding multiple repeats in DNA sequences, ACS/IEEE International Conference on Computer Systems and Applications, AICCSA 2010, pp.1-8, 2010.
DOI : 10.1109/AICCSA.2010.5587026

URL : https://hal.archives-ouvertes.fr/inria-00480001

M. Federico, P. Peterlongo, N. Pisanti, and M. Sagot, Finding long and multiple repetitions with edit distance, Proceedings of the Prague Stringology Conference, 2011.

R. Grossi, A. Pietracaprina, N. Pisanti, G. Pucci, E. Upfal et al., MADMX: A Strategy for Maximal Dense Motif Extraction, Journal of Computational Biology, vol.18, issue.4, pp.535-545, 2011.
DOI : 10.1089/cmb.2010.0177

C. S. Iliopoulos, J. A. McHugh, P. Peterlongo, N. Pisanti, W. Rytter et al., A First Approach to Finding Common Motifs with Gaps, International Journal of Foundations of Computer Science, vol.16, issue.06, pp.1145-1154, 2005.
DOI : 10.1142/S0129054105003716

URL : https://hal.archives-ouvertes.fr/hal-00620030

J. Jones and M. Gellert, The taming of a transposon: V(D)J recombination and the immune system, Immunological Reviews, vol.200, issue.1, pp.233-248, 2004.
DOI : 10.1146/annurev.cellbio.15.1.435

I. Koch, Enumerating all connected maximal common subgraphs in two graphs, Theoretical Computer Science, vol.250, issue.1-2, pp.1-30, 2001.
DOI : 10.1016/S0304-3975(00)00286-3

URL : http://doi.org/10.1016/s0304-3975(00)00286-3

T. Lassmann, O. Frings, and E. Sonnhammer, Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features, Nucleic Acids Research, vol.37, issue.3, pp.858-865, 2009.
DOI : 10.1093/nar/gkn1006

T. Lassmann and E. L. Sonnhammer, Kalign – an accurate and fast multiple sequence alignment algorithm, BMC Bioinformatics, vol.6, p.298, 2005.
DOI : 10.1186/1471-2105-6-298

D. J. Lipman, S. F. Altschul, and J. D. Kececioglu, A tool for multiple sequence alignment, Proceedings of the National Academy of Sciences, vol.86, issue.12, pp.4412-4415, 1989.
DOI : 10.1073/pnas.86.12.4412

URL : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC287279/pdf

G. Liti, D. M. Carter, A. M. Moses et al., Population genomics of domestic and wild yeasts, Nature, vol.458, issue.7236, pp.337-341, 2009.
DOI : 10.1038/nature07743

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2659681

L. Marsan and M. Sagot, Algorithms for Extracting Structured Motifs Using a Suffix Tree with an Application to Promoter and Regulatory Site Consensus Identification, Journal of Computational Biology, vol.7, issue.3-4, pp.345-362, 2000.
DOI : 10.1089/106652700750050826

C. Notredame, D. G. Higgins, and J. Heringa, T-coffee: a novel method for fast and accurate multiple sequence alignment, Journal of Molecular Biology, vol.302, issue.1, pp.205-217, 2000.
DOI : 10.1006/jmbi.2000.4042

V. Pereira, D. Enard, and A. Eyre-Walker, The Effect of Transposable Element Insertions on Gene Expression Evolution in Rodents, PLoS ONE, vol.4, issue.2, p.e4321, 2009.
DOI : 10.1371/journal.pone.0004321

P. Peterlongo, N. Pisanti, F. Boyer, A. P. do Lago, and M. Sagot, Lossless filter for multiple repetitions with Hamming distance, Journal of Discrete Algorithms, vol.6, issue.3, pp.497-509, 2008.
DOI : 10.1016/j.jda.2007.03.003

URL : https://hal.archives-ouvertes.fr/inria-00179731

P. Peterlongo, N. Pisanti, F. Boyer, and M. Sagot, Lossless Filter for Finding Long Multiple Approximate Repetitions Using a New Data Structure, the Bi-factor Array, Proceedings of 12th Symposium on String Processing and Information Retrieval, pp.179-190, 2005.
DOI : 10.1007/11575832_20

URL : https://hal.archives-ouvertes.fr/inria-00328129

P. Peterlongo, G. T. Sacomoto, A. P. do Lago, N. Pisanti, and M. Sagot, Lossless filter for multiple repeats with bounded edit distance, Algorithms for Molecular Biology, vol.4, issue.1, pp.1-20, 2009.
DOI : 10.1186/1748-7188-4-3

URL : https://hal.archives-ouvertes.fr/hal-00784457

P. Peterlongo, N. Schnel, N. Pisanti, M. Sagot, and V. Lacroix, Identifying SNPs without a Reference Genome by Comparing Raw Reads, Proceedings of 17th Symposium on String Processing and Information Retrieval, pp.147-158, 2010.
DOI : 10.1007/978-3-642-16321-0_14

URL : https://hal.archives-ouvertes.fr/inria-00514887

N. Pisanti, M. Giraud, and P. Peterlongo, Filters and Seeds Approaches for Fast Homology Searches in Large Datasets, Algorithms in Computational Molecular Biology, pp.299-320, 2010.
DOI : 10.1016/0304-3975(92)90143-4

URL : https://hal.archives-ouvertes.fr/inria-00425370

K. Rasmussen, J. Stoye, and E. Myers, Efficient q-Gram Filters for Finding All ε-Matches over a Given Length, Journal of Computational Biology, vol.13, issue.2, pp.296-308, 2006.
DOI : 10.1089/cmb.2006.13.296

URL : https://hal.archives-ouvertes.fr/hal-00307305

S. E. Rombo, Extracting string motif bases for quorum higher than two, Theoretical Computer Science, vol.460, pp.94-103, 2012.
DOI : 10.1016/j.tcs.2012.06.021

A. Subramanian, M. Kauffmann, and B. Morgenstern, DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment, Algorithms for Molecular Biology, vol.3, issue.1, 2008.
DOI : 10.1186/1748-7188-3-6

URL : http://doi.org/10.1186/1748-7188-3-6

J. D. Thompson, D. G. Higgins, and T. J. Gibson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research, vol.22, issue.22, pp.4673-4680, 1994.
DOI : 10.1093/nar/22.22.4673

T. Treangen and S. Salzberg, Repetitive DNA and next-generation sequencing: computational challenges and solutions, Nature Reviews Genetics, vol.13, issue.1, pp.36-46, 2012.
DOI : 10.1038/nrg3117

URL : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3324860/pdf

Z. Xu and H. Wang, LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons, Nucleic Acids Research, vol.35, issue.Web Server, pp.W265-W268, 2007.
DOI : 10.1093/nar/gkm286