Conference papers

Finding Long and Multiple Repeats with Edit Distance

Maria Federico 1 Pierre Peterlongo 2 Nadia Pisanti 3 Marie-France Sagot 4 
2 SYMBIOSE - Biological systems and models, bioinformatics and sequences
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
4 BAMBOO - An algorithmic view on genomes, cells, and environments
Inria Grenoble - Rhône-Alpes, LBBE - Laboratoire de Biométrie et Biologie Evolutive - UMR 5558
Abstract : We present a tool for detecting long similar fragments that occur two or more times in a set of biological sequences. The problem has interesting applications in the analysis of biological sequences and their correlation, and becomes computationally challenging when a non-negligible number of insertions, deletions and substitutions is allowed. For this reason, exact exhaustive methods are hardly of practical use. In this paper we introduce a tool, FilmRed, that performs this task and handles instances whose size and parameter combinations cannot be handled by any existing tool. This is achieved by using a filter as a preprocessing step, and by reusing the information gathered by the filter in the subsequent inference phase. To the best of our knowledge, FilmRed is the first ab initio tool that can deal with repeats that may occur several times, that are hundreds or thousands of bases long, and whose occurrences may differ in more than 10% of their positions through substitutions and indels.
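To make the similarity criterion in the abstract concrete, the following is a minimal sketch of the classical dynamic-programming edit distance (substitutions, insertions and deletions each cost 1), together with a check of whether two fragments differ in at most a given fraction of their positions. It is not FilmRed's filtering or inference algorithm, which the abstract does not detail. The function names, the default 10% threshold and the choice to normalise by the longer fragment are assumptions made for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of substitutions,
    insertions and deletions needed to turn a into b."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def is_approximate_repeat(a: str, b: str, max_error_rate: float = 0.10) -> bool:
    """Hypothetical helper: True if the edit distance between a and b is at
    most max_error_rate times the length of the longer fragment."""
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return edit_distance(a, b) <= max_error_rate * longest


if __name__ == "__main__":
    x = "ACGTACGTAC"
    y = "ACGTTCGTAC"  # one substitution relative to x
    print(edit_distance(x, y), is_approximate_repeat(x, y))
```

This quadratic-time comparison of a single pair is exactly the kind of exact computation that becomes impractical when applied exhaustively to long sequences, which is why the abstract describes a filter as a preprocessing step.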
Contributor : Pierre Peterlongo
Submitted on : Tuesday, July 12, 2011 - 2:27:13 PM
Last modification on : Wednesday, February 2, 2022 - 3:54:54 PM


  • HAL Id : inria-00608208, version 1


Maria Federico, Pierre Peterlongo, Nadia Pisanti, Marie-France Sagot. Finding Long and Multiple Repeats with Edit Distance. The Prague Stringology Conference 2011, Aug 2011, Prague, Czech Republic. ⟨inria-00608208⟩


