
An EM algorithm for mapping short reads in multiple RNA structure probing experiments

Afaf Saaidi 1, 2 Yann Ponty 2, 1, 3 Mathieu Blanchette 4 Mireille Regnier 1, 2 Bruno Sargueil 5
2 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
3 AMIBIO - Algorithms and Models for Integrative BIOlogy
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau]
Abstract : Accurately mapping reads against a reference sequence is the first step towards a sound NGS data analysis. However, when reads must be assigned to a set of RNA variants that were sequenced simultaneously, the task becomes hard to handle. Many algorithms have been developed to map reads against a set of homologous sequences at once, but the problem is not fully resolved, particularly for short reads. The issue addressed in our study is more challenging still: in addition to the parallel assignment of short reads, the RNA variant molecules used in the sequencing library preparation step undergo a specific experimental treatment, SHAPE, which induces mutations at structurally unpaired nucleotides. SHAPE-induced mutations can lead to mis-mapping, i.e. a read derived from a given RNA variant i may, because of its SHAPE mutations, appear better assigned to another variant j, to which it has the smallest base distance. In ongoing work, we are trying to resolve this unprecedented mapping question through an Expectation-Maximization (EM) algorithm in which each RNA variant in the reference set is characterized by a SHAPE mutational profile rather than merely by a nucleotide sequence. The EM algorithm aims to maximize the likelihood that a read is derived from a specific RNA variant and to assess the read's contribution to that variant's mutational profile.
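The abstract does not give the model details, so the following is only a minimal sketch of how an EM of this kind could look, not the authors' implementation. It assumes (none of this is stated in the source) that reads are already placed at a known start offset with no indels, that all reference variants have the same length, that a mutated position is equally likely to show any of the three other bases, and that a small pseudocount smooths the per-position rates. The function name `em_assign` and all parameters are hypothetical.

```python
import math


def em_assign(reads, refs, n_iter=50, init_rate=0.05, pseudo=1.0):
    """Illustrative EM for soft read-to-variant assignment (hypothetical sketch).

    reads: list of (start, sequence) pairs, assumed gap-free and in range.
    refs:  list of equal-length reference strings, one per RNA variant.
    Returns (responsibilities, per-variant mutation profiles, variant weights).
    """
    n_var, length = len(refs), len(refs[0])
    weights = [1.0 / n_var] * n_var                       # mixture proportions
    rates = [[init_rate] * length for _ in range(n_var)]  # mutational profiles
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each read derives from each variant.
        resp = []
        for start, seq in reads:
            logp = []
            for k in range(n_var):
                lp = math.log(max(weights[k], 1e-12))     # floor avoids log(0)
                for i, base in enumerate(seq):
                    m = rates[k][start + i]
                    if base != refs[k][start + i]:
                        lp += math.log(m / 3.0)           # assumed uniform substitution
                    else:
                        lp += math.log(1.0 - m)
                logp.append(lp)
            top = max(logp)                               # stabilise the exponentials
            probs = [math.exp(lp - top) for lp in logp]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights and profiles from the soft assignments.
        for k in range(n_var):
            weights[k] = sum(r[k] for r in resp) / len(reads)
            mutated = [0.0] * length
            covered = [0.0] * length
            for (start, seq), r in zip(reads, resp):
                for i, base in enumerate(seq):
                    pos = start + i
                    covered[pos] += r[k]
                    if base != refs[k][pos]:
                        mutated[pos] += r[k]
            # Pseudocount keeps every rate strictly between 0 and 1.
            rates[k] = [(mutated[p] + pseudo * init_rate) / (covered[p] + pseudo)
                        for p in range(length)]
    return resp, rates, weights
```

Called with, say, two references that differ at a single position and a handful of reads carrying extra mismatches, `em_assign` returns one row of responsibilities per read; a read can then be assigned to the variant with the largest responsibility, while `rates[k]` plays the role of the mutational profile of variant k under the assumptions above.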
Document type :
Conference papers
Contributor : Afaf Saaidi
Submitted on : Tuesday, September 19, 2017 - 4:52:27 PM
Last modification on : Friday, January 21, 2022 - 3:10:02 AM


  • HAL Id : hal-01590528, version 1


Afaf Saaidi, Yann Ponty, Mathieu Blanchette, Mireille Regnier, Bruno Sargueil. An EM algorithm for mapping short reads in multiple RNA structure probing experiments. Matbio2017, King's College London, Sep 2017, London, United Kingdom. ⟨hal-01590528⟩


