Conference paper, year: 2012

Large deviation properties for patterns

Abstract

Deciding whether a given pattern is over-represented or under-represented with respect to a given background model is a key question in computational biology. Such a decision is usually made by computing p-values reflecting the "exceptionality" of a pattern in a given sequence or set of sequences. In the simplest cases (short and simple patterns, simple background model, small number of sequences), an exact p-value can be computed with tractable complexity. Realistic cases are in general too complicated for such an exact computation, so approximations are proposed (Gaussian, Poisson, large deviation). These approximations are applicable under certain conditions: Gaussian approximations are valid in the central domain, while Poisson and large deviation approximations are valid for rare events. In the present paper, we prove a large deviation approximation for the double-strand counting problem, that is, the counting of a given pattern in a set of sequences that arise from both strands of the genome. Here, dependencies between a sequence and its complement play a fundamental role. General combinatorial properties of the pattern make it possible to handle such dependencies. A large deviation result is also provided for a set of small sequences.
This paper presents large deviation results on words. The first case treated corresponds to the counting of two words, which covers the important case of searching for a motif on the two complementary strands of DNA. The second case is the search for a finite set of words in a set of short sequences, with different occurrence probabilities.
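To make the "simplest cases" of the abstract concrete, here is a minimal Python sketch of an exact p-value computation for a single short word under an i.i.d. (Bernoulli) background model, together with a naive double-strand count. It is an illustration only and not the method of the paper: the function names, the i.i.d. model, the dynamic programming over a Knuth-Morris-Pratt automaton, and the convention that a word equal to its own reverse complement is counted once are all assumptions made for this example.

```python
from collections import defaultdict

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word):
    """Return the word read on the opposite DNA strand."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def count_overlapping(word, text):
    """Count occurrences of word in text, overlaps included."""
    return sum(text.startswith(word, i) for i in range(len(text) - len(word) + 1))


def double_strand_count(word, text):
    """Count word and its reverse complement in text.

    Assumption: a word equal to its reverse complement is counted once.
    """
    rc = reverse_complement(word)
    total = count_overlapping(word, text)
    if rc != word:
        total += count_overlapping(rc, text)
    return total


def build_automaton(word, alphabet):
    """KMP automaton: state = length of the longest prefix of word
    that is a suffix of the text read so far."""
    m = len(word)
    fail = [0] * (m + 1)  # fail[i] = longest proper border of word[:i]
    k = 0
    for i in range(1, m):
        while k > 0 and word[i] != word[k]:
            k = fail[k]
        if word[i] == word[k]:
            k += 1
        fail[i + 1] = k
    delta = []
    for state in range(m + 1):
        row = {}
        for a in alphabet:
            if state < m and word[state] == a:
                row[a] = state + 1
            elif state == 0:
                row[a] = 0
            else:
                row[a] = delta[fail[state]][a]
        delta.append(row)
    return delta


def exact_pvalue(word, n, k, probs):
    """P(at least k overlapping occurrences of word) in an i.i.d.
    sequence of length n with letter probabilities probs."""
    delta = build_automaton(word, list(probs))
    m = len(word)
    dist = {(0, 0): 1.0}  # (automaton state, count capped at k) -> probability
    for _ in range(n):
        nxt = defaultdict(float)
        for (state, count), p in dist.items():
            for a, pa in probs.items():
                s2 = delta[state][a]
                c2 = min(k, count + (s2 == m))
                nxt[(s2, c2)] += p * pa
        dist = nxt
    return sum(p for (_, c), p in dist.items() if c >= k)


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(exact_pvalue("AA", 3, 1, uniform))  # 7/64 = 0.109375
```

The dynamic program costs on the order of n times the word length times k times the alphabet size, which is tractable only for short words and moderate n; this is the regime in which the abstract says exact p-values are available, and beyond which approximations are needed.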
Main file
template1611.pdf (274.73 KB)
Origin: files produced by the author(s)

Dates and versions

hal-00758251, version 1 (06-12-2012)

Identifiers

  • HAL Id: hal-00758251, version 1

Cite

Jérémie Bourdon, Mireille Regnier. Large deviation properties for patterns. LSD&LAW 2012, Simon J. Puglisi and Golnaz Badkobeh, Feb 2012, London, United Kingdom. ⟨hal-00758251⟩
