Efficient repeat finding in sets of strings via suffix arrays

Abstract : We consider two repeat finding problems relative to sets of strings: (a) Find the largest substrings that occur in every string of a given set; (b) Find the maximal repeats in a given string that occur in no string of a given set. Our solutions are based on the suffix array construction, requiring O(m) memory, where m is the length of the longest input string, and O(n &log;m) time, where n is the the whole input size (the sum of the length of each string in the input). The most expensive part of our algorithms is the computation of several suffix arrays. We give an implementation and experimental results that evidence the efficiency of our algorithms in practice, even for very large inputs.
Type de document :
Article dans une revue
Discrete Mathematics and Theoretical Computer Science, DMTCS, 2013, Vol. 15 no. 2 (2), pp.59--70
Liste complète des métadonnées

Littérature citée [9 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/hal-00980753
Contributeur : Service Ist Inria Sophia Antipolis-Méditerranée / I3s <>
Soumis le : vendredi 18 avril 2014 - 16:43:31
Dernière modification le : mercredi 1 août 2018 - 15:02:03
Document(s) archivé(s) le : lundi 10 avril 2017 - 15:39:24

Fichier

2130-7867-1-PB.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

  • HAL Id : hal-00980753, version 1

Collections

Citation

Pablo Barenbaum, Verónica Becher, Alejandro Deymonnaz, Melisa Halsband, Pablo Ariel Heiber. Efficient repeat finding in sets of strings via suffix arrays. Discrete Mathematics and Theoretical Computer Science, DMTCS, 2013, Vol. 15 no. 2 (2), pp.59--70. 〈hal-00980753〉

Partager

Métriques

Consultations de la notice

772

Téléchargements de fichiers

303