Analysis of pattern overlaps and exact computation of P-values of pattern occurrences numbers: case of Hidden Markov Models

Abstract : We present a novel algorithm, SufPref, computing an exact pvalue for Hidden Markov models (HMM). The algorithm inductively traverses specific data structure, the overlap graph. Nodes of the graph are associated with the overlaps of words from a given set H. Edges are associated to the prefix and suffix relations between ovelaps. An originality of our data structure is that pattern H need not be explicitly represented in nodes or leaves. The algorithm relies on the Cartesian product of the overlap graph and the graph of HMM states; the approach is analogous to a weighted automaton approach. The gain in size of SufPref data structure leads to significant space and time complexity improvements. We suppose that all words in the pattern H are of the same length m. The algorithm SufPref was implemented as a C++ program; it can be used both as Web-server and a stand alone program for Linux and Windows.
Liste complète des métadonnées

Cited literature [37 references]  Display  Hide  Download

https://hal.inria.fr/hal-00858701
Contributor : Mireille Regnier <>
Submitted on : Thursday, September 5, 2013 - 5:46:05 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:29 PM
Document(s) archivé(s) le : Thursday, April 6, 2017 - 4:00:35 PM

File

jwtemplate.pdf
Files produced by the author(s)

Identifiers

Collections

Citation

Mireille Régnier, Evgenia Furletova, Victor Yakovlev, Mikhail Roytberg. Analysis of pattern overlaps and exact computation of P-values of pattern occurrences numbers: case of Hidden Markov Models. Algorithms for Molecular Biology, BioMed Central, 2014, 9 (1), ⟨10.1186/s13015-014-0025-1⟩. ⟨hal-00858701⟩

Share

Metrics

Record views

603

Files downloads

262