Prediction of protein function using a deep convolutional neural network ensemble

Abstract: Background. The availability of large databases containing high-resolution three-dimensional (3D) models of proteins, in conjunction with functional annotation, allows the exploitation of advanced supervised machine learning techniques for automatic protein function prediction. Methods. In this work, novel shape features are extracted that represent protein structure as local (per amino acid) distributions of angles and of amino acid distances. Each of the multi-channel feature maps is fed into a deep convolutional neural network (CNN) for function prediction, and the outputs are fused through support vector machines or a correlation-based k-nearest neighbor classifier. Two different architectures are investigated, employing either one CNN per multi-channel feature set or one CNN per image channel. Results. Cross-validation experiments on single-functional enzymes (n = 44,661) from the Protein Data Bank (PDB) achieved 90.1% correct classification, demonstrating an improvement over previous results on the same dataset when sequence similarity was not considered. Discussion. The automatic prediction of protein function can provide quick annotations on extensive datasets, opening the path for relevant applications such as pharmacological target identification. The proposed method shows promise for structure-based protein function prediction, but sufficient data may not yet be available to properly assess the method's performance on non-homologous proteins and thus reduce the confounding factor of evolutionary relationships.
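
The fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each CNN emits a vector of class scores, that these vectors are concatenated, and that a query is assigned the majority label among the k training vectors with the highest Pearson correlation. The abstract does not specify these details, so the function names (`pearson`, `fuse`, `knn_predict`), the toy scores, and the class labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def pearson(u, v):
    """Pearson correlation between two equal-length vectors."""
    n = len(u)
    mu_u = sum(u) / n
    mu_v = sum(v) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mu_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mu_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # constant vector: correlation undefined, treated as 0 here
    return cov / (norm_u * norm_v)


def fuse(cnn_outputs):
    """Concatenate the class-score vectors of several CNNs (assumed fusion input)."""
    return [score for output in cnn_outputs for score in output]


def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors most correlated with the query."""
    ranked = sorted(
        range(len(train_vectors)),
        key=lambda i: pearson(query, train_vectors[i]),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (invented): two hypothetical CNNs, three class scores each.
train_vectors = [
    fuse([[0.8, 0.1, 0.1], [0.7, 0.2, 0.1]]),
    fuse([[0.9, 0.05, 0.05], [0.6, 0.3, 0.1]]),
    fuse([[0.1, 0.8, 0.1], [0.2, 0.7, 0.1]]),
    fuse([[0.1, 0.7, 0.2], [0.1, 0.8, 0.1]]),
]
train_labels = ["class_1", "class_1", "class_2", "class_2"]

query = fuse([[0.7, 0.2, 0.1], [0.8, 0.1, 0.1]])
predicted = knn_predict(query, train_vectors, train_labels, k=3)
print(predicted)
```

Correlation is used here instead of Euclidean distance so that two score vectors with the same shape but different scale still count as close; whether the paper applies it in exactly this way is not stated in the abstract.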
Document type:
Journal article
PeerJ Computer Science, PeerJ, 2017, 3, pp.1-17. 〈10.7717/peerj-cs.124〉

https://hal.inria.fr/hal-01648534
Contributor: Evangelia Zacharaki
Submitted on: Sunday, November 26, 2017 - 16:05:30
Last modified on: Friday, January 12, 2018 - 10:55:24

File

J28_Zacharaki_PeerJ-CS_2017.pd...
Publisher files allowed on an open archive

Citation

Evangelia Zacharaki. Prediction of protein function using a deep convolutional neural network ensemble. PeerJ Computer Science, PeerJ, 2017, 3, pp.1-17. 〈10.7717/peerj-cs.124〉. 〈hal-01648534〉
