Conference papers

How to measure the topological quality of protein parse trees?

Mateusz Pyzik 1 François Coste 2 Witold Dyrka 1
2 Dyliss - Dynamics, Logics and Inference for biological Systems and Sequences
Inria Rennes – Bretagne Atlantique , IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
Abstract : Human readability and, consequently, interpretability are often considered key advantages of grammatical descriptors. Beyond natural language, this also holds in the analysis of biological sequences of RNA, which are typically modeled by grammars of at least context-free expressiveness. However, in protein sequence analysis, the explanatory power of grammatical descriptors beyond regular ones has never been thoroughly assessed. Since the biological meaning of a protein molecule is directly related to its spatial structure, it is reasonable to expect the parse tree of a protein sequence to reflect the spatial structure of the protein. In this work, we propose and assess quantitative measures for comparing the topology of the parse tree of a context-free grammar with the topology of the protein structure, succinctly represented by a contact map. Our results are potentially of interest beyond the bioinformatic context, wherever a reference matrix of dependencies between sequence constituents is available.

Cited literature [43 references]

https://hal.inria.fr/hal-01938608
Contributor : François Coste
Submitted on : Wednesday, November 28, 2018 - 6:04:03 PM
Last modification on : Saturday, July 11, 2020 - 3:15:08 AM

File

pyzik18.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01938608, version 1

Citation

Mateusz Pyzik, François Coste, Witold Dyrka. How to measure the topological quality of protein parse trees?. ICGI 2018 - 14th International Conference on Grammatical Inference, Sep 2018, Wroclaw, Poland. pp.118 - 138. ⟨hal-01938608⟩
