Journal articles

Reconsidering the significance of genomic word frequencies.

Abstract : By conventional wisdom, a feature that occurs too often or too rarely in a genome can indicate a functional element. To infer functionality from frequency, it is crucial to precisely characterize occurrences in randomly evolving DNA. We find that the frequency of oligonucleotides in a genomic sequence follows primarily a Pareto-lognormal distribution, which encapsulates lognormal and power-law features found across all known genomes. Such a distribution could be the result of completely random evolution by a copying process. Our characterization of the entire frequency distribution of genomic words opens a way to a more accurate reasoning about their over- and underrepresentation in genomic sequences.

Contributor : Laurent Noé
Submitted on : Tuesday, January 26, 2010 - 9:49:20 AM
Last modification on : Friday, February 4, 2022 - 3:12:36 AM




Miklós Csűrös, Laurent Noé, Gregory Kucherov. Reconsidering the significance of genomic word frequencies. Trends in Genetics, Elsevier, 2007, 23 (11), pp.543-6. ⟨10.1016/j.tig.2007.07.008⟩. ⟨inria-00448737⟩


