APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data - Inria - Institut national de recherche en sciences et technologies du numérique
Journal article, Nucleic Acids Research, Year: 2019

APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data

Abstract

Small non-coding RNAs (sRNAs) regulate numerous cellular processes in all domains of life. Several approaches have been developed to identify them from RNA-seq data; these are efficient for eukaryotic sRNAs but remain inaccurate for the longer and highly structured bacterial sRNAs. We present APERO, a new algorithm to detect small transcripts from paired-end bacterial RNA-seq data. In contrast to previous approaches that start from the read coverage distribution, APERO analyzes boundaries of individual sequenced fragments to infer the 5' and 3' ends of all transcripts. Since sRNAs are about the same size as individual fragments (50-350 nucleotides), this algorithm provides a significantly higher accuracy and robustness, e.g., with respect to spontaneous internal breaking sites. To demonstrate this improvement, we develop a comparative assessment on datasets from Escherichia coli and Salmonella enterica, based on experimentally validated sRNAs. We also identify the small transcript repertoire of Dickeya dadantii, including putative intergenic RNAs, 5' UTR or 3' UTR-derived RNA products and antisense RNAs. Comparisons to annotations as well as RACE-PCR experimental data confirm the precision of the detected transcripts. Altogether, APERO outperforms all existing methods in terms of sRNA detection and boundary precision, which is crucial for comprehensive genome annotations. It is freely available as an open-source R package at https://github.com/Simon-Leonard/APERO.
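To make the idea of working from fragment boundaries rather than read coverage more concrete, here is a minimal Python sketch. It is not APERO's actual algorithm or API (APERO is an R package, and its method is described in the paper); the function names `find_start_peaks` and `infer_transcripts`, the parameters `min_count` and `window`, and the toy data are all hypothetical, chosen only for illustration. The sketch assumes each paired-end fragment has already been reduced to a `(start, end)` pair on a single strand.

```python
from collections import Counter


def find_start_peaks(fragments, min_count=3, window=2):
    """Cluster fragment 5' boundaries lying within `window` nt of each other.

    Returns (best_position, cluster_min, cluster_max, support) for each
    cluster supported by at least `min_count` fragments.
    Hypothetical helper for illustration, not part of APERO.
    """
    counts = Counter(start for start, _ in fragments)
    clusters, current = [], []
    for pos in sorted(counts):
        # Start a new cluster when the gap to the previous position is too large.
        if current and pos - current[-1] > window:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    peaks = []
    for cluster in clusters:
        support = sum(counts[p] for p in cluster)
        if support >= min_count:
            best = max(cluster, key=lambda p: counts[p])
            peaks.append((best, cluster[0], cluster[-1], support))
    return peaks


def infer_transcripts(fragments, min_count=3, window=2):
    """Pair each well-supported 5' end with the most frequent 3' boundary
    among the fragments that start in the same cluster."""
    transcripts = []
    for best, lo, hi, support in find_start_peaks(fragments, min_count, window):
        ends = Counter(end for start, end in fragments if lo <= start <= hi)
        three_prime, _ = ends.most_common(1)[0]
        transcripts.append({"start": best, "end": three_prime, "support": support})
    return transcripts


if __name__ == "__main__":
    # Toy data (made up): two small transcripts, one fragment truncated by an
    # internal break (100, 150), and one isolated fragment (250, 300).
    fragments = [
        (100, 180), (100, 180), (101, 180), (100, 150),
        (400, 520), (400, 520), (401, 520),
        (250, 300),
    ]
    for t in infer_transcripts(fragments):
        print(t)
    # {'start': 100, 'end': 180, 'support': 4}
    # {'start': 400, 'end': 520, 'support': 3}
```

In this toy example the fragment truncated at position 150 does not shift the inferred 3' end, because the end is taken from the most frequent boundary rather than from a drop in coverage. This only loosely mirrors the robustness to internal breaking sites described in the abstract; a real implementation would also need to handle strands, sequencing depth and overlapping transcripts.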
Main file
APERO.pdf (1.5 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02152104, version 1 (06-11-2021)

Cite

Simon Leonard, Sam Meyer, Stéphan Lacour, William Nasser, Florence Hommais, et al. APERO: a genome-wide approach for identifying bacterial small RNAs from RNA-Seq data. Nucleic Acids Research, 2019, 47 (15), pp.1-12. ⟨10.1093/nar/gkz485⟩. ⟨hal-02152104⟩