Finding Long and Multiple Repeats with Edit Distance

Maria Federico 1 Pierre Peterlongo 2 Nadia Pisanti 3 Marie-France Sagot 4
2 SYMBIOSE - Biological systems and models, bioinformatics and sequences
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
4 BAMBOO - An algorithmic view on genomes, cells, and environments
Inria Grenoble - Rhône-Alpes, LBBE - Laboratoire de Biométrie et Biologie Evolutive - UMR 5558
Abstract : We present a tool for detecting long similar fragments that occur two or more times in a set of biological sequences. The problem has interesting applications in the analysis of biological sequences and their correlation, and becomes computationally challenging when a non-negligible number of insertions, deletions and substitutions is allowed. For this reason, exact exhaustive methods are hardly of practical use. In this paper we introduce a tool, FilmRed, that performs this task and handles instances whose size and parameter combinations cannot be handled by any existing tool. This is achieved by using a filter as a preprocessing step, and by reusing the information gathered by the filter in the subsequent inference phase. To the best of our knowledge, FilmRed is the first ab initio tool that can deal with repeats that may occur several times, that are hundreds or thousands of bases long, and whose occurrences may differ in more than 10% of their positions in terms of substitutions and indels.
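To make the similarity criterion in the abstract concrete, the following is a minimal sketch of the standard dynamic-programming edit distance (substitutions, insertions and deletions, each at unit cost), together with a check that two fragments differ in at most a given fraction of their positions. It is not FilmRed's algorithm: the function names `edit_distance` and `within_error_rate` and the normalisation by the longer fragment are assumptions made for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of substitutions,
    insertions and deletions needed to turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def within_error_rate(a: str, b: str, max_rate: float) -> bool:
    """Hypothetical helper: True if a and b differ by at most max_rate
    edits per position, relative to the longer of the two fragments."""
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return edit_distance(a, b) <= max_rate * longest
```

This quadratic-time comparison of every pair of fragments is the kind of exhaustive computation that becomes impractical on long sequences, which is the motivation the abstract gives for filtering first.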

https://hal.inria.fr/inria-00608208
Contributor : Pierre Peterlongo
Submitted on : Tuesday, July 12, 2011 - 2:27:13 PM
Last modification on : Thursday, March 21, 2019 - 2:51:28 PM

Identifiers

  • HAL Id : inria-00608208, version 1

Citation

Maria Federico, Pierre Peterlongo, Nadia Pisanti, Marie-France Sagot. Finding Long and Multiple Repeats with Edit Distance. The Prague Stringology Conference 2011, Aug 2011, Prague, Czech Republic. ⟨inria-00608208⟩
