Finding Frequent Subsequences in a Set of Texts

Alban Mancheron 1,*, Jean-Émile Symphor 2
* Corresponding author
1 SEQUOIA - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe
Abstract : Given a set of strings, the Common Subsequence Automaton accepts all common subsequences of these strings. Such an automaton can be derived from other automata such as the Directed Acyclic Subsequence Graph or the Subsequence Automaton. In this paper, we introduce some new issues in text algorithmics based on problems related to common subsequences. First, we give an overview of the existing automata, focusing on their similarities and differences. Second, we present a new automaton, the Constrained Subsequence Automaton, which extends the Common Subsequence Automaton by adding an integer $q$ called the quorum.
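
To make the notions in the abstract concrete, the following minimal Python sketch tests by brute force whether a word is a subsequence of every text (the language accepted by the Common Subsequence Automaton) or of at least $q$ texts. It is not the automaton construction from the report. The reading of the quorum as "subsequence of at least $q$ strings" is an assumption drawn from the title, and all function names are hypothetical.

```python
from typing import List


def is_subsequence(candidate: str, text: str) -> bool:
    """Return True if `candidate` can be obtained from `text` by deleting characters."""
    remaining = iter(text)
    # `ch in remaining` consumes the iterator up to the first match,
    # so the characters of `candidate` are matched left to right, in order.
    return all(ch in remaining for ch in candidate)


def support(candidate: str, texts: List[str]) -> int:
    """Count the texts that contain `candidate` as a subsequence."""
    return sum(is_subsequence(candidate, t) for t in texts)


def is_common_subsequence(candidate: str, texts: List[str]) -> bool:
    """True if `candidate` is a subsequence of every text in the set."""
    return support(candidate, texts) == len(texts)


def meets_quorum(candidate: str, texts: List[str], q: int) -> bool:
    """ASSUMPTION: the quorum q is read as 'subsequence of at least q texts'."""
    return support(candidate, texts) >= q


if __name__ == "__main__":
    texts = ["abcab", "bacb", "cabb"]
    for word in ("ab", "ba", "aa"):
        print(word, support(word, texts), is_common_subsequence(word, texts))
```

Under this reading, setting $q$ to the number of texts gives back the common-subsequence test. An automaton answers the same question in time proportional to the length of the candidate word, whereas this check rescans every text.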

https://hal.inria.fr/inria-00257561
Contributor : Alban Mancheron
Submitted on : Tuesday, February 19, 2008 - 4:25:59 PM
Last modification on : Thursday, February 21, 2019 - 10:52:49 AM
Long-term archiving on : Thursday, May 20, 2010 - 10:50:38 PM

Files

CSAq.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00257561, version 1

Citation

Alban Mancheron, Jean-Émile Symphor. Finding Frequent Subsequences in a Set of Texts. [Research Report] 2007, pp.13. ⟨inria-00257561⟩
