Large deviation properties for patterns

Jérémie Bourdon 1, Mireille Regnier 2, 3, *
* Corresponding author
3 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
Abstract : Deciding whether a given pattern is over-represented or under-represented with respect to a given background model is a key question in computational biology. Such a decision is usually made by computing p-values reflecting the "exceptionality" of a pattern in a given sequence or set of sequences. In the simplest cases (short and simple patterns, simple background model, small number of sequences), an exact p-value can be computed with tractable complexity. Realistic cases are in general too complicated to yield such an exact p-value. Approximations are thus proposed (Gaussian, Poisson, and large deviation approximations). These approximations are applicable under some conditions: Gaussian approximations are valid in the central domain, while Poisson and large deviation approximations are valid for rare events. In the present paper, we prove a large deviation approximation for the double-strand counting problem, which refers to counting a given pattern in a set of sequences that arise from both strands of the genome. Here, dependencies between a sequence and its complement play a fundamental role. General combinatorial properties of the pattern make it possible to deal with such a dependency. A large deviation result is also provided for a set of small sequences.
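
As a rough illustration of the double-strand counting problem described in the abstract, the following minimal sketch counts a word together with its reverse complement along one strand, which is equivalent to counting the word on both strands. It then computes a Poisson tail p-value, not the large deviation approximation proved in the paper. All function names are hypothetical, and the background model is assumed to be i.i.d. letters with given frequencies; the Poisson approximation itself is only reasonable for rare words and ignores the self-overlap structure of the pattern.

```python
import math

# Watson-Crick base pairing, used to read the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word):
    """Return the word as it reads on the opposite strand."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def count_overlapping(seq, word):
    """Count occurrences of word in seq, overlapping ones included."""
    last_start = len(seq) - len(word) + 1
    return sum(1 for i in range(last_start) if seq.startswith(word, i))


def double_strand_count(seq, word):
    """Count word on both strands by scanning one strand for word and
    for its reverse complement. A word equal to its own reverse
    complement is counted only once per position."""
    rc = reverse_complement(word)
    total = count_overlapping(seq, word)
    if rc != word:
        total += count_overlapping(seq, rc)
    return total


def word_probability(word, freqs):
    """Probability of word under an i.i.d. letter model (assumption)."""
    p = 1.0
    for c in word:
        p *= freqs[c]
    return p


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k))
    return 1.0 - below


def poisson_pvalue(seq, word, freqs):
    """Poisson-approximated p-value for over-representation of word,
    counted on both strands (a sketch, not the paper's method)."""
    positions = len(seq) - len(word) + 1
    rc = reverse_complement(word)
    lam = positions * word_probability(word, freqs)
    if rc != word:
        lam += positions * word_probability(rc, freqs)
    return poisson_upper_tail(double_strand_count(seq, word), lam)
```

For example, `poisson_pvalue(seq, "ACG", {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25})` would give an approximate p-value under a uniform background. The check `rc != word` reflects the dependency between a sequence and its complement: for a word such as `ACGT`, every occurrence on one strand is automatically an occurrence on the other.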

https://hal.inria.fr/hal-00758251
Contributor : Mireille Regnier
Submitted on : Thursday, December 6, 2012 - 2:49:37 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:29 PM
Long-term archiving on : Saturday, December 17, 2016 - 4:17:47 PM

File

template1611.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00758251, version 1

Citation

Jérémie Bourdon, Mireille Regnier. Large deviation properties for patterns. LSD&LAW 2012, Simon J. Puglisi and Golnaz Badkobeh, Feb 2012, London, United Kingdom. ⟨hal-00758251⟩
