Conference paper, Year: 2018

How to measure the topological quality of protein parse trees?

Mateusz Pyzik (1), François Coste (2), Witold Dyrka (1)


Human readability and, consequently, interpretability are often considered key advantages of grammatical descriptors. Beyond natural language, this also holds for the analysis of biological sequences of RNA, which are typically modeled by grammars of at least context-free expressiveness. However, in protein sequence analysis, the explanatory power of grammatical descriptors beyond the regular level has never been thoroughly assessed. Since the biological meaning of a protein molecule is directly related to its spatial structure, it is justified to expect that the parse tree of a protein sequence reflects the spatial structure of the protein. In this work, we propose and assess quantitative measures for comparing the topology of the parse tree of a context-free grammar with the topology of the protein structure, succinctly represented by a contact map. Our results are potentially of interest beyond their bioinformatic context, wherever a reference matrix of dependencies between sequence constituents is available.
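To make the idea of comparing a parse tree with a contact map concrete, the following is a minimal illustrative sketch. It is not one of the measures proposed in the paper. It assumes a parse tree given as nested tuples whose leaves are single-residue strings, and a contact map given as a symmetric 0/1 matrix indexed by residue position. The function names `leaf_paths`, `tree_distance`, and `mean_distances` are hypothetical. The sketch reports the mean tree distance between residues that are in contact and between residues that are not; a tree that reflects the structure would be expected to place contacting residues closer together.

```python
from itertools import combinations


def leaf_paths(tree, prefix=()):
    """Return root-to-leaf paths (tuples of child indices), left to right."""
    if isinstance(tree, str):  # a leaf is a single residue symbol
        return [prefix]
    paths = []
    for k, child in enumerate(tree):
        paths.extend(leaf_paths(child, prefix + (k,)))
    return paths


def tree_distance(p, q):
    """Number of edges between two leaves, given their root-to-leaf paths."""
    common = 0
    for a, b in zip(p, q):
        if a != b:
            break
        common += 1
    return len(p) + len(q) - 2 * common


def mean_distances(tree, contact_map):
    """Mean tree distance over contacting and over non-contacting leaf pairs."""
    paths = leaf_paths(tree)
    in_contact, apart = [], []
    for i, j in combinations(range(len(paths)), 2):
        d = tree_distance(paths[i], paths[j])
        (in_contact if contact_map[i][j] else apart).append(d)

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(in_contact), mean(apart)


# Toy input (made up for illustration): four residues, two subtrees.
toy_tree = (("A", "B"), ("C", "D"))
toy_contacts = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
contact_mean, apart_mean = mean_distances(toy_tree, toy_contacts)
```

In this toy case the contacting pairs sit in the same subtree, so their mean tree distance is smaller than that of the non-contacting pairs. Real contact maps are usually derived from inter-residue distances with a threshold, which this sketch leaves out.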
Main file
pyzik18.pdf (1.7 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01938608, version 1 (28-11-2018)


  • HAL Id: hal-01938608, version 1


Mateusz Pyzik, François Coste, Witold Dyrka. How to measure the topological quality of protein parse trees? ICGI 2018 - 14th International Conference on Grammatical Inference, Sep 2018, Wroclaw, Poland. pp. 118-138. ⟨hal-01938608⟩

