The BRaliBase dent - a tale of benchmark design and interpretation.

Benedikt Löwes (1), Cedric Chauve (2), Yann Ponty (3, 4), Robert Giegerich (1)
(3) AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
Abstract: BRaliBase is a widely used benchmark for assessing the accuracy of RNA secondary structure alignment methods. In most case studies based on the BRaliBase benchmark, one can observe a puzzling drop in accuracy in the 40-60% sequence identity range, the so-called 'BRaliBase Dent'. In this article, we show that this dent is due to a bias in the composition of the BRaliBase benchmark, namely the inclusion of a disproportionate number of transfer RNAs, which exhibit a conserved secondary structure. Beyond its relevance to the specific case of the BRaliBase benchmark, our analysis also raises important questions about the design and use of benchmarks in computational biology.
Benedikt Löwes, Cedric Chauve, Yann Ponty, Robert Giegerich. The BRaliBase dent - a tale of benchmark design and interpretation. Briefings in Bioinformatics, Oxford University Press (OUP), 2017, 18 (2), pp. 306-311.



