An unbiased adaptive sampling algorithm for the exploration of RNA mutational landscapes under evolutionary pressure.

Jérôme Waldispühl 1 Yann Ponty 2, 3
3 AMIB - Algorithms and Models for Integrative Biology
CNRS - Centre National de la Recherche Scientifique : UMR8623, X - École polytechnique, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique, LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau]
Abstract : The analysis of the relationship between sequences and structures (i.e., how mutations affect structures and reciprocally how structures influence mutations) is essential to decipher the principles driving molecular evolution, to infer the origins of genetic diseases, and to develop bioengineering applications such as the design of artificial molecules. Because their structures can be predicted from the sequence data only, RNA molecules provide a good framework to study this sequence-structure relationship. We recently introduced a suite of algorithms called RNAmutants which allows a complete exploration of RNA sequence-structure maps in polynomial time and space. Formally, RNAmutants takes an input sequence (or seed) to compute the Boltzmann-weighted ensembles of mutants with exactly k mutations, and sample mutations from these ensembles. However, this approach suffers from major limitations. Indeed, since the Boltzmann probabilities of the mutations depend of the free energy of the structures, RNAmutants has difficulties to sample mutant sequences with low G+C-contents. In this article, we introduce an unbiased adaptive sampling algorithm that enables RNAmutants to sample regions of the mutational landscape poorly covered by classical algorithms. We applied these methods to sample mutations with low G+C-contents. These adaptive sampling techniques can be easily adapted to explore other regions of the sequence and structural landscapes which are difficult to sample. Importantly, these algorithms come at a minimal computational cost. We demonstrate the insights offered by these techniques on studies of complete RNA sequence structures maps of sizes up to 40 nucleotides. Our results indicate that the G+C-content has a strong influence on the size and shape of the evolutionary accessible sequence and structural spaces. In particular, we show that low G+C-contents favor the apparition of internal loops and thus possibly the synthesis of tertiary structure motifs. On the other hand, high G+C-contents significantly reduce the size of the evolutionary accessible mutational landscapes.
Type de document :
Article dans une revue
Journal of Computational Biology, Mary Ann Liebert, 2011, 18 (11), pp.1465-79. 〈10.1089/cmb.2011.0181〉
Liste complète des métadonnées

https://hal.inria.fr/hal-00681928
Contributeur : Yann Ponty <>
Soumis le : jeudi 22 mars 2012 - 19:59:28
Dernière modification le : jeudi 12 avril 2018 - 01:48:41

Identifiants

Collections

Citation

Jérôme Waldispühl, Yann Ponty. An unbiased adaptive sampling algorithm for the exploration of RNA mutational landscapes under evolutionary pressure.. Journal of Computational Biology, Mary Ann Liebert, 2011, 18 (11), pp.1465-79. 〈10.1089/cmb.2011.0181〉. 〈hal-00681928〉

Partager

Métriques

Consultations de la notice

296