Conference papers

Exploiting Complex Protein Domain Networks for Protein Function Annotation

Bishnu Sarker (1, 2); David Ritchie (1); Sabeur Aridhi (2, 1)
1 CAPSID - Computational Algorithms for Protein Structures and Interactions
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract : Huge numbers of protein sequences are now available in public databases. To exploit this valuable biological data more fully, these sequences need to be annotated with functional properties such as Enzyme Commission (EC) numbers and Gene Ontology terms. The UniProt Knowledgebase (UniProtKB) is currently the largest and most comprehensive resource for protein sequence and annotation data. In the March 2018 release of UniProtKB, some 556,000 sequences have been manually curated, but over 111 million sequences still lack functional annotations. The ability to annotate these sequences automatically would represent a major advance for the field of bioinformatics. Here, we present a novel network-based approach called GrAPFI for the automatic functional annotation of protein sequences. The underlying assumption of GrAPFI is that proteins may be related to each other by the protein domains, families, and super-families that they share. Several protein domain databases exist, such as InterPro, Pfam, SMART, CDD, Gene3D, and Prosite. Our approach uses InterPro domains, because the InterPro database contains information from several other major protein family and domain databases. Our results show that GrAPFI achieves better EC number annotation performance than several other previously described approaches.
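The central idea of the abstract, that proteins sharing domains may share function, can be illustrated with a small sketch. This is not the GrAPFI implementation: the abstract does not state the similarity measure or the inference rule, so the Jaccard weighting and the weighted neighbour vote below are assumptions chosen for illustration, and all protein, domain, and EC identifiers are hypothetical placeholders.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two domain sets (assumed edge weight)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_domain_graph(domains):
    """Link every pair of proteins that share at least one domain.

    domains: dict mapping protein ID -> set of domain IDs.
    Returns a nested dict: graph[p][q] = similarity weight.
    """
    by_domain = defaultdict(set)
    for protein, doms in domains.items():
        for d in doms:
            by_domain[d].add(protein)

    graph = defaultdict(dict)
    for members in by_domain.values():
        for p, q in combinations(sorted(members), 2):
            w = jaccard(domains[p], domains[q])
            graph[p][q] = w
            graph[q][p] = w
    return graph


def predict_ec(graph, known_ec, query):
    """Weighted vote over annotated neighbours (assumed inference rule)."""
    scores = defaultdict(float)
    for neighbour, w in graph.get(query, {}).items():
        ec = known_ec.get(neighbour)
        if ec is not None:
            scores[ec] += w
    return max(scores, key=scores.get) if scores else None


# Hypothetical toy data: placeholder protein, domain, and EC labels.
domains = {
    "P1": {"D1", "D2"},
    "P2": {"D1", "D2", "D3"},
    "P3": {"D3"},
    "Q": {"D1", "D2"},  # unannotated query protein
}
known_ec = {"P1": "EC_X", "P2": "EC_X", "P3": "EC_Y"}

graph = build_domain_graph(domains)
print(predict_ec(graph, known_ec, "Q"))
```

In this toy network the query protein is linked only to the two proteins with which it shares domains, so the vote is decided by their annotations. A real pipeline would take its domain assignments from InterPro rather than from hand-written sets.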

Cited literature [34 references]
Contributor : Bishnu Sarker
Submitted on : Tuesday, November 13, 2018 - 1:23:31 PM
Last modification on : Wednesday, November 3, 2021 - 7:09:20 AM
Long-term archiving on : Thursday, February 14, 2019 - 2:16:17 PM




  • HAL Id : hal-01920595, version 1



Bishnu Sarker, David Ritchie, Sabeur Aridhi. Exploiting Complex Protein Domain Networks for Protein Function Annotation. Complex Networks 2018 - 7th International Conference on Complex Networks and Their Applications, Dec 2018, Cambridge, United Kingdom. ⟨hal-01920595⟩


