Report (Research Report) Year: 2001

Assessing Statistical Significance of Overrepresented Oligonucleotides

Alain Denise
Mireille Regnier
  • Role: Author
  • PersonId: 833582
Mathias Vandenbogaert
  • Role: Author

Abstract

Assessing the statistical significance of overrepresented exceptional words is becoming an important task in computational biology. We show how large-deviation methodology applies to two problems. First, when some oligomer $\path$ occurs more often than expected, i.e., may be overrepresented, large deviations allow for a very efficient computation of the so-called $p$-value. The second problem we address is the possible change in the oligomer distribution induced by the overrepresentation of some pattern. Discarding this noise allows for the detection of weaker signals. Related algorithmic and complexity issues are discussed and compared to previous results. The approach is illustrated with two typical examples of applications on biological data.
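
As a rough illustration of the first problem, the sketch below compares an exact upper-tail $p$-value with a large-deviation (Chernoff) bound for an overrepresented word count. It is not the report's algorithm: it assumes, for simplicity, that the count is Binomial(n, p), which ignores word overlaps and any Markov dependence in the sequence. The function names and the numbers `n`, `p`, `k` are hypothetical.

```python
import math


def kl_bernoulli(a, p):
    """Kullback-Leibler divergence KL(Bernoulli(a) || Bernoulli(p)), in nats."""
    total = 0.0
    if a > 0:
        total += a * math.log(a / p)
    if a < 1:
        total += (1 - a) * math.log((1 - a) / (1 - p))
    return total


def chernoff_upper_tail(n, k, p):
    """Chernoff bound on P(X >= k) for X ~ Binomial(n, p): exp(-n * KL(k/n || p))."""
    a = k / n
    if a <= p:
        return 1.0  # the bound is only informative above the expected frequency
    return math.exp(-n * kl_bernoulli(a, p))


def exact_upper_tail(n, k, p):
    """Exact P(X >= k) by direct summation; cost grows with n, unlike the bound."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical setting (assumption): a 3-mer with uniform letter probabilities,
# 200 candidate positions, 12 observed occurrences.
n, p, k = 200, 4.0 ** -3, 12
print("exact p-value   :", exact_upper_tail(n, k, p))
print("large-dev. bound:", chernoff_upper_tail(n, k, p))
```

The bound is evaluated in constant time, whereas the exact sum grows with the sequence length; this is the kind of efficiency gain the abstract attributes to large deviations, under the simplified model assumed here.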

Domains

Other [cs.OH]
Main file
RR-4132.pdf (325.28 KB)

Dates and versions

inria-00072496, version 1 (24-05-2006)

Identifiers

  • HAL Id: inria-00072496, version 1

Cite

Alain Denise, Mireille Regnier, Mathias Vandenbogaert. Assessing Statistical Significance of Overrepresented Oligonucleotides. [Research Report] RR-4132, INRIA. 2001. ⟨inria-00072496⟩