All-pairs suffix/prefix in optimal time using Aho-Corasick space - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Article Dans Une Revue Information Processing Letters Année : 2022

All-pairs suffix/prefix in optimal time using Aho-Corasick space

Résumé

The all-pairs suffix/prefix (APSP) problem is a classic problem in computer science with many applications in bioinformatics. Given a set {S 1 ,. .. , S k } of k strings of total length n, we are asked to find, for each string S i , i ∈ [1, k], its longest suffix that is a prefix of string S j , for all j = i, j ∈ [1, k]. Several algorithms running in the optimal O(n + k 2) time for solving APSP are known. All of these algorithms are based on suffix sorting and thus require space (n) in any case. We consider the parameterized version of the APSP problem, denoted by-APSP, in which we are asked to output only the pairs whose suffix/prefix overlap is of length at least. We give an algorithm for solving-APSP that runs in the optimal O(n + |OUTPUT |) time using O(n) space, where OUTPUT is the set of output pairs. Our algorithm is thus optimal for the APSP problem as well by setting = 0. Notably, our algorithm is fundamentally different from all optimal algorithms solving the APSP problem: it does not rely on sorting the suffixes of all input strings but on a novel traversal of the Aho-Corasick machine, and it thus requires space linear in the size of the machine.
Fichier principal
Vignette du fichier
31637.pdf (468.12 Ko) Télécharger le fichier
Origine : Fichiers éditeurs autorisés sur une archive ouverte

Dates et versions

hal-03832860 , version 1 (28-10-2022)

Identifiants

Citer

Grigorios Loukides, Solon P Pissis. All-pairs suffix/prefix in optimal time using Aho-Corasick space. Information Processing Letters, 2022, 178, pp.106275. ⟨10.1016/j.ipl.2022.106275⟩. ⟨hal-03832860⟩
40 Consultations
53 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More