Assessing Statistical Significance of Overrepresented Oligonucleotides

Alain Denise; Mireille Regnier; Mathias Vandenbogaert

Report (Research Report) Year: 2001

Alain Denise

Role: Author
PersonId : 7014
IdHAL : alain-denise
ORCID : 0000-0003-4484-4996
IdRef : 035353368

Mireille Regnier

Role: Author
PersonId : 833582

Algorithms

Mathias Vandenbogaert

Role: Author

Abstract

Assessing the statistical significance of overrepresented exceptional words is becoming an important task in computational biology. We show how large-deviation methodology applies to two problems. First, when some oligomer occurs more often than expected, i.e. may be overrepresented, large deviations allow for a very efficient computation of the so-called $p$-value. The second problem we address is the possible change in the oligomer distribution induced by the overrepresentation of some pattern. Discarding this noise allows for the detection of weaker signals. Related algorithmic and complexity issues are discussed and compared to previous results. The approach is illustrated with two typical examples of applications to biological data.

Domains

Other [cs.OH]

Main file

RR-4132.pdf (325.28 KB)

Contributor: Rapport De Recherche Inria

https://inria.hal.science/inria-00072496

Submitted on: Wednesday, May 24, 2006, 10:07:46

Last modified on: Tuesday, February 7, 2023, 03:39:24

Long-term archiving on: Tuesday, February 22, 2011, 12:06:06

Dates and versions

inria-00072496 , version 1 (24-05-2006)

Identifiers

HAL Id : inria-00072496 , version 1

Cite

Alain Denise, Mireille Regnier, Mathias Vandenbogaert. Assessing Statistical Significance of Overrepresented Oligonucleotides. [Research Report] RR-4132, INRIA. 2001. ⟨inria-00072496⟩

Collections

INRIA INRIA-RRRT INRIA2 LARA

51 Views

145 Downloads