
Rare Events and Conditional Events on Random Strings

Abstract: Some strings (the texts) are assumed to be randomly generated according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over- or under-representation of a word or a set of words. The aim of this paper is twofold. First, given a single word, one studies the tail distribution of its number of occurrences, and sharp large deviation estimates are derived. Second, one assumes that a given word is overrepresented and studies the distribution of a second word; formulae for the expectation and the variance are derived. In both cases, the formulae are accurate and actually computable. These results have applications in computational biology, where a genome is viewed as a text.
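To make the notion of over-representation concrete, the following minimal sketch compares the observed number of occurrences of a word in a simulated Bernoulli text with its expected count. It is an illustration only, not the paper's large deviation estimates: the function names, the uniform DNA alphabet, the word "ACGT", and the text length are all assumptions chosen for the example. The expectation uses the standard identity that a word of length m has expected count (n - m + 1) times the product of its letter probabilities in a Bernoulli text of length n.

```python
import random
from math import prod


def count_occurrences(text, word):
    """Count occurrences of word in text, overlapping occurrences included."""
    last_start = len(text) - len(word)
    return sum(1 for i in range(last_start + 1) if text.startswith(word, i))


def expected_occurrences(n, word, probs):
    """Expected count of word in a Bernoulli (i.i.d. letters) text of length n."""
    positions = max(0, n - len(word) + 1)
    return positions * prod(probs[c] for c in word)


def random_text(n, probs, rng):
    """Draw a text of length n with independent letters distributed as probs."""
    letters = list(probs)
    weights = [probs[c] for c in letters]
    return "".join(rng.choices(letters, weights=weights, k=n))


if __name__ == "__main__":
    # Hypothetical parameters: uniform DNA alphabet, word "ACGT", length 10000.
    probs = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    rng = random.Random(0)
    text = random_text(10_000, probs, rng)
    observed = count_occurrences(text, "ACGT")
    expected = expected_occurrences(10_000, "ACGT", probs)
    print(f"observed={observed}, expected={expected:.2f}")
```

An observed count far above the expected value is the kind of rare event the abstract refers to; quantifying how unlikely such a deviation is requires the tail estimates derived in the paper, which this sketch does not attempt.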
Document type: Journal articles
Contributor: Service Ist Inria Sophia Antipolis-Méditerranée / I3s
Submitted on: Thursday, March 13, 2014 - 5:05:07 PM
Last modification on: Friday, January 21, 2022 - 3:14:56 AM
Long-term archiving on: Friday, June 13, 2014 - 12:10:54 PM


Files produced by the author(s)


  • HAL Id : hal-00959004, version 1



Mireille Régnier, Alain Denise. Rare Events and Conditional Events on Random Strings. Discrete Mathematics and Theoretical Computer Science, DMTCS, 2004, 6 (2), pp.191-214. ⟨hal-00959004⟩


