Expected distance between terminal nucleotides of RNA secondary structures.

Peter Clote 1, 2, * Yann Ponty 2, 3 Jean-Marc Steyaert 2, 3
* Corresponding author
3 AMIB - Algorithms and Models for Integrative Biology
CNRS - Centre National de la Recherche Scientifique : UMR8623, Polytechnique - X, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique, LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau]
Abstract: In "The ends of a large RNA molecule are necessarily close", Yoffe et al. (Nucleic Acids Res 39(1):292-299, 2011) used the programs RNAfold [resp. RNAsubopt] from the Vienna RNA Package to calculate the distance between the 5' and 3' ends of the minimum free energy secondary structure [resp. thermal equilibrium structures] of viral and random RNA sequences. Here, the 5'-3' distance is defined to be the length of the shortest path from the 5' node to the 3' node in the undirected graph whose edge set consists of edges {i, i + 1} corresponding to covalent backbone bonds and of edges {i, j} corresponding to canonical base pairs. From repeated simulations and using a heuristic theoretical argument, Yoffe et al. conclude that the 5'-3' distance is less than a fixed constant, independent of RNA sequence length. In this paper, we provide a rigorous mathematical framework to study the expected distance from the 5' to the 3' end of an RNA sequence. We present recurrence relations that precisely define this expected distance, both for the Turner nearest-neighbor energy model and for a simple homopolymer model first defined by Stein and Waterman. We implement dynamic programming algorithms to compute (rather than approximate by repeated application of the Vienna RNA Package) the expected distance between the 5' and 3' ends of a given RNA sequence, with respect to the Turner energy model. Using methods of analytical combinatorics, which depend on complex analysis, we prove that the asymptotic expected 5'-3' distance of length n homopolymers is approximately equal to the constant 5.47211, while the asymptotic distance is 6.771096 if hairpins have a minimum of 3 unpaired bases and the probability that any two positions can form a base pair is 1/4. Finally, we analyze the 5'-3' distance for secondary structures from the STRAND database, and conclude that the 5'-3' distance is correlated with RNA sequence length.
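The 5'-3' distance defined in the abstract can be illustrated with a minimal sketch. It assumes the secondary structure is given in dot-bracket notation (an assumption; the abstract does not fix an input format) and runs a breadth-first search over backbone edges {i, i + 1} and base-pair edges {i, j}. The function names `pair_table`, `end_to_end_distance`, `structures` and `expected_distance` are hypothetical. The last two enumerate every homopolymer structure of a small length n by brute force and average the distance, assuming every structure is equally likely and that a hairpin needs at least `theta` unpaired bases; this is not the paper's dynamic programming algorithm and says nothing about the asymptotic constants.

```python
from collections import deque


def pair_table(structure):
    """Map each paired position to its partner (0-based) in a dot-bracket string."""
    partner, stack = {}, []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def end_to_end_distance(structure):
    """Shortest path length from the 5' node (index 0) to the 3' node (index n-1)."""
    n = len(structure)
    partner = pair_table(structure)
    dist = {0: 0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == n - 1:
            return dist[i]
        neighbors = [i - 1, i + 1]           # backbone edges {i, i + 1}
        if i in partner:
            neighbors.append(partner[i])     # base-pair edge {i, j}
        for j in neighbors:
            if 0 <= j < n and j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    raise ValueError("structure must be non-empty")


def structures(n, theta=1):
    """Yield every secondary structure of a length-n homopolymer (brute force)."""
    if n == 0:
        yield ""
        return
    # Case 1: the last position is unpaired.
    for s in structures(n - 1, theta):
        yield s + "."
    # Case 2: the last position pairs with k, enclosing at least theta bases.
    for k in range(0, n - 1 - theta):
        for left in structures(k, theta):
            for inner in structures(n - 2 - k, theta):
                yield left + "(" + inner + ")"


def expected_distance(n, theta=1):
    """Average 5'-3' distance over all structures, each taken as equally likely."""
    dists = [end_to_end_distance(s) for s in structures(n, theta)]
    return sum(dists) / len(dists)
```

For example, `end_to_end_distance("(...)(...)")` follows one base pair, one backbone bond and a second base pair. The enumeration grows exponentially with n, so it is only usable for short lengths.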
Document type:
Journal article
Journal of Mathematical Biology, Springer Verlag (Germany), 2012, 65 (3), pp.581-99. 〈10.1007/s00285-011-0467-8〉

https://hal.inria.fr/inria-00619921
Contributor: Yann Ponty
Submitted on: Monday, September 12, 2011 - 20:25:00
Last modified on: Thursday, January 11, 2018 - 06:23:08
Document(s) archived on: Tuesday, December 13, 2011 - 02:20:37

Files

paper.pdf
Files produced by the author(s)

Citation

Peter Clote, Yann Ponty, Jean-Marc Steyaert. Expected distance between terminal nucleotides of RNA secondary structures. Journal of Mathematical Biology, Springer Verlag (Germany), 2012, 65 (3), pp.581-99. 〈10.1007/s00285-011-0467-8〉. 〈inria-00619921〉
