Journal articles

GrAPFI: predicting enzymatic function of proteins from domain similarity graphs

Bishnu Sarker 1, David Ritchie 1, Sabeur Aridhi 1
1 CAPSID - Computational Algorithms for Protein Structures and Interactions
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract:

Background: Thanks to recent developments in genomic sequencing technologies, the number of protein sequences in public databases is growing enormously. To enrich and exploit this immensely valuable data, it is essential to annotate these sequences with functional properties such as Enzyme Commission (EC) numbers. The January 2019 release of the UniProt Knowledgebase (UniProtKB) contains around 140 million protein sequences. However, only about half a million of these (UniProtKB/Swiss-Prot) have been reviewed and functionally annotated by expert curators using data extracted from the literature and computational analyses. To reduce the gap between annotated and unannotated protein sequences, it is essential to develop accurate automatic protein function annotation techniques.

Results: In this work, we present GrAPFI (Graph-based Automatic Protein Function Inference), which automatically annotates proteins with EC number functional descriptors using a protein domain similarity graph. We validated the performance of GrAPFI using six reference proteomes in UniProtKB/Swiss-Prot, namely Human, Mouse, Rat, Yeast, E. coli and Arabidopsis thaliana. We also compared GrAPFI with existing EC prediction approaches such as ECPred, DEEPre, and SVMProt. The comparison shows that GrAPFI achieves better accuracy, and comparable or better coverage, than these earlier approaches.

Conclusions: GrAPFI is a novel protein function annotation tool that performs automatic inference on a network of proteins that are related according to their domain composition. Our evaluation shows that GrAPFI performs better than other state-of-the-art methods. GrAPFI is available at https://gitlab.inria.fr/bsarker/bmc_grapfi.git as a standalone tool written in Python.
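To make the idea of inference over a domain similarity graph more concrete, the following is a minimal sketch and not GrAPFI's actual implementation. The abstract does not state which similarity measure or inference rule GrAPFI uses, so the Jaccard similarity, the similarity-weighted voting, the `min_similarity` threshold, the function names, and the toy protein and domain identifiers are all illustrative assumptions.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two domain sets (an assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_ec_numbers(query_domains, annotated, min_similarity=0.3):
    """Rank EC numbers for a query protein by similarity-weighted voting.

    `annotated` maps a protein id to (domain set, set of EC numbers).
    Proteins whose domain similarity to the query reaches `min_similarity`
    act as graph neighbours and vote for their own EC numbers.
    """
    scores = defaultdict(float)
    for _protein_id, (domains, ec_numbers) in annotated.items():
        weight = jaccard(query_domains, domains)
        if weight >= min_similarity:
            for ec in ec_numbers:
                scores[ec] += weight
    total = sum(scores.values())
    # Normalise so the scores sum to 1, then sort by decreasing support.
    return sorted(
        ((ec, score / total) for ec, score in scores.items()),
        key=lambda item: item[1],
        reverse=True,
    )


# Hypothetical toy data: protein ids and domain names are made up.
annotated = {
    "P1": ({"D1", "D2"}, {"1.1.1.1"}),
    "P2": ({"D1", "D3"}, {"1.1.1.1"}),
    "P3": ({"D4"}, {"2.7.11.1"}),
}
predictions = rank_ec_numbers({"D1", "D2"}, annotated)
```

In this toy setting, only the annotated proteins that share domains with the query contribute votes, so a query with no sufficiently similar neighbour receives an empty ranking. A real tool would also need to decide how to handle partial EC numbers and how to calibrate the threshold, which this sketch does not attempt.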

https://hal.inria.fr/hal-03022601
Contributor: Sabeur Aridhi
Submitted on: Tuesday, November 24, 2020 - 8:06:55 PM
Last modification on: Tuesday, January 5, 2021 - 4:59:48 PM

File

Sarker et al BMC Bioinformatic...
Files produced by the author(s)

Citation

Bishnu Sarker, David Ritchie, Sabeur Aridhi. GrAPFI: predicting enzymatic function of proteins from domain similarity graphs. BMC Bioinformatics, BioMed Central, 2020, ⟨10.1186/s12859-020-3460-7⟩. ⟨hal-03022601⟩
