Journal articles

APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data

Simon Leonard 1, 2; Sam Meyer 1, 2; Stéphan Lacour 3; William Nasser 2, 1; Florence Hommais 1, 2; Sylvie Reverchon 2, 1
2 CRP - Chromatine et Régulation de la Pathogénie bactérienne
MAP - Microbiologie, adaptation et pathogénie
3 IBIS - Modeling, simulation, measurement, and control of bacterial regulatory networks
LAPM - Laboratoire Adaptation et pathogénie des micro-organismes [Grenoble], Inria Grenoble - Rhône-Alpes, Institut Jean Roget
Abstract : Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data, which are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO, a new algorithm to detect small transcripts from paired-end bacterial RNA-seq data. In contrast to previous approaches that start from the read coverage distribution, APERO analyzes boundaries of individual sequenced fragments to infer the 5' and 3' ends of all transcripts. Since sRNAs are about the same size as individual fragments (50-350 nucleotides), this algorithm provides a significantly higher accuracy and robustness, e.g., with respect to spontaneous internal breaking sites. To demonstrate this improvement, we develop a comparative assessment on datasets from Escherichia coli and Salmonella enterica, based on experimentally validated sRNAs. We also identify the small transcript repertoire of Dickeya dadantii including putative intergenic RNAs, 5' UTR or 3' UTR-derived RNA products and antisense RNAs. Comparisons to annotations as well as RACE-PCR experimental data confirm the precision of the detected transcripts. Altogether, APERO outperforms all existing methods in terms of sRNA detection and boundary precision, which is crucial for comprehensive genome annotations. It is freely available as an open source R package on https://github.com/Simon-Leonard/APERO.
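The idea of working from fragment boundaries rather than from the coverage profile can be illustrated with a toy sketch. This is not APERO's implementation (the package itself is written in R); the function name `detect_small_transcripts`, the `min_starts` threshold, and the rule of picking the most frequent 3' end are all assumptions made for illustration. Only the 350-nucleotide upper bound comes from the fragment size range quoted in the abstract.

```python
from collections import Counter


def detect_small_transcripts(fragments, min_starts=3, max_len=350):
    """Toy boundary-based detection on a single strand.

    fragments: (start, end) pairs, 1-based inclusive, one pair per
    sequenced paired-end fragment. All parameters are hypothetical.
    Returns a list of (start, end, support) tuples.
    """
    fragments = list(fragments)

    # Step 1: count how many fragments begin at each position.
    # Positions with many fragment starts are candidate 5' ends.
    start_counts = Counter(start for start, _ in fragments)

    transcripts = []
    for start, n_starts in sorted(start_counts.items()):
        if n_starts < min_starts:
            continue  # too few fragments start here to call a 5' end

        # Step 2: among fragments sharing this 5' end, tally their 3' ends,
        # keeping only fragments short enough to span a small transcript.
        end_counts = Counter(
            end for s, end in fragments
            if s == start and end - s + 1 <= max_len
        )
        if not end_counts:
            continue

        # Step 3: take the most frequent 3' end (ties go to the longest).
        end, support = max(end_counts.items(), key=lambda kv: (kv[1], kv[0]))
        transcripts.append((start, end, support))

    return transcripts


# Fragments starting at position 130 mimic a sparse internal break site:
# they add coverage but do not form a well-supported 5' end.
example = [(100, 180)] * 4 + [(100, 150)] + [(130, 180)] * 2 + [(500, 560)] * 3
print(detect_small_transcripts(example))
```

In this sketch the two fragments beginning at position 130 fall below the `min_starts` threshold, so they do not split the transcript starting at 100, which is the kind of robustness to internal breaking sites that the abstract attributes to a boundary-based approach.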

https://hal.archives-ouvertes.fr/hal-02152104
Contributor : Agnès Rodrigue
Submitted on : Tuesday, June 11, 2019 - 10:11:55 AM
Last modification on : Friday, March 6, 2020 - 1:44:43 AM

Citation

Simon Leonard, Sam Meyer, Stéphan Lacour, William Nasser, Florence Hommais, et al. APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data. Nucleic Acids Research, Oxford University Press, 2019, 47 (15), pp.1-12. ⟨10.1093/nar/gkz485⟩. ⟨hal-02152104⟩
