Service interruption on Monday 11 July from 12:30 to 13:00: all the sites of the CCSD (HAL, EpiSciences, SciencesConf, AureHAL) will be inaccessible (network hardware connection).
Skip to Main content Skip to Navigation
Preprints, Working Papers, ...

Minimized Compact Automaton for Clumps over Degenerate Patterns

Abstract : Clumps are sequences of overlapping occurrences of a given pattern that play a vital role in the study of distribution of pattern occurrences. These distributions are sused for finding functional fragments in biological sequences. In this paper we present a minimized compacted automaton (Overlap walking automaton, OWA) recognizing all the possible clumps for degenerate patterns and its usage for computation of probabilities of sets of clumps. We also present Aho-Corasick like automaton, RMinPatAut, recognizing all the sequences ending with pattern occurrences. The states of RMinPatAut are equivalence classes on the prefixes of the pattern words. We have proved that RMinPatAut is Nerode-minimal, i.e., minimal in classical sense. We use RMinPatAut as an auxiliary structure for OWA construction. For degenerate pattern, RMinPatAut can be constructed in linear time on the number of its states (it is bounded by 2m, where m the length of pattern words). OWA can be constructed in linear time on the sum of its size and RMinPatAut size.
Complete list of metadata

Cited literature [20 references]  Display  Hide  Download
Contributor : Mireille Regnier Connect in order to contact the contributor
Submitted on : Thursday, January 24, 2019 - 6:56:04 PM
Last modification on : Friday, February 4, 2022 - 3:09:32 AM


OWAv5 (3).pdf
Files produced by the author(s)


  • HAL Id : hal-01940837, version 2


Eugenia Furletova, Jan Holub, Mireille Regnier. Minimized Compact Automaton for Clumps over Degenerate Patterns. 2019. ⟨hal-01940837v2⟩



Record views


Files downloads