Journal articles

Large deviation properties for patterns

Mireille Regnier 1, 2 Jérémie Bourdon 3 
1 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
Abstract: Deciding whether a given pattern is over- or under-represented with respect to a given background model is a key question in computational biology. Such a decision is usually made by computing p-values that reflect the "exceptionality" of a pattern in a given sequence or set of sequences. In the simplest cases (short and simple patterns, a simple background model, a small number of sequences), an exact p-value can be computed with tractable complexity. Realistic cases are in general too complicated for an exact p-value to be computed, so approximations are used instead (Gaussian, Poisson, and large deviation approximations). Each is applicable only under certain conditions: Gaussian approximations are valid in the central domain, while Poisson and large deviation approximations are valid for rare events. In the present paper, we prove a large deviation approximation for the double-strand counting problem, in which occurrences of a given pattern are counted in a set of sequences that arise from both strands of the genome. In that case, dependencies between a sequence and its reverse complement cannot be neglected. They are captured here for a Bernoulli model from general combinatorial properties of the pattern. A large deviation result is also provided for a set of small sequences.
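As a rough illustration of the validity regimes mentioned in the abstract, the sketch below compares an exact tail p-value with Gaussian, Poisson, and large deviation estimates. It is not the method of the paper: it assumes a toy model in which the occurrence count is Binomial(n, p), that is, occurrences at different positions are treated as independent, which ignores pattern overlaps and the strand dependencies the paper addresses. The function names and the parameters n, p, k are hypothetical, and the large deviation estimate used here is the classical Chernoff bound exp(-n D(k/n || p)).

```python
import math


def _log_binom_pmf(n, p, i):
    # log P(X = i) for X ~ Binomial(n, p), in log space to avoid overflow
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p))


def exact_tail(n, p, k):
    """Exact P(X >= k) under the toy binomial model (assumption)."""
    return sum(math.exp(_log_binom_pmf(n, p, i)) for i in range(k, n + 1))


def gaussian_tail(n, p, k):
    """Normal approximation of P(X >= k) with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    z = (k - 0.5 - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))


def poisson_tail(n, p, k):
    """Poisson(n * p) approximation of P(X >= k).

    The upper tail is summed directly and truncated after 1000 terms,
    which is negligible for moderate values of n * p.
    """
    lam = n * p
    return sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
               for i in range(k, k + 1000))


def large_deviation_bound(n, p, k):
    """Chernoff bound exp(-n * D(a || p)) on P(X >= k), with a = k / n."""
    a = k / n
    if not p < a < 1:
        raise ValueError("requires p < k/n < 1 (over-representation)")
    rate = a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))
    return math.exp(-n * rate)


# Hypothetical example: a 4-letter word under a uniform DNA model,
# n = 1000 positions, k = 15 observed occurrences (expected about 3.9).
n, p, k = 1000, 0.25 ** 4, 15
for name, f in [("exact", exact_tail), ("gaussian", gaussian_tail),
                ("poisson", poisson_tail), ("large deviation", large_deviation_bound)]:
    print(f"{name:>16}: {f(n, p, k):.3e}")
```

With these parameters the observed count lies far from the mean, so the Gaussian estimate falls well below the exact tail, while the Poisson estimate stays close to it and the Chernoff bound remains a valid upper bound.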
Contributor: Mireille Regnier
Submitted on: Tuesday, October 1, 2013 - 2:54:27 PM
Last modification on: Sunday, June 26, 2022 - 11:59:47 AM




Mireille Regnier, Jérémie Bourdon. Large deviation properties for patterns. Journal of Discrete Algorithms, 2013, ⟨10.1016/j.jda.2013.09.004⟩. ⟨hal-00868462⟩


