Automatic Classification of Protein Structures Using the Maximum Contact Map Overlap Metric

Abstract: In this work, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to non-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly and, on a larger, extended version of the benchmark with 60,850 additional structures, up to 1361 out of 1369 queries. Our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.
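To illustrate why a metric lets a search skip comparisons: the triangle inequality gives the lower bound d(q, x) >= |d(q, p) - d(p, x)| for any reference structure p, so a database entry x can be discarded without computing d(q, x) whenever that bound already exceeds the best distance found so far. The sketch below is a minimal illustration of this pruning idea only, not the authors' k-NN scheme. It assumes Euclidean distance on 2-D points as a cheap stand-in for the max-CMO metric, and the names `nearest_with_pivot`, `pivot`, and `pivot_dists` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def nearest_with_pivot(
    query: T,
    database: Sequence[T],
    pivot: T,
    dist: Callable[[T, T], float],
    pivot_dists: Sequence[float],
) -> Tuple[int, float, int]:
    """Exact nearest-neighbor search with triangle-inequality pruning.

    pivot_dists[i] must hold dist(pivot, database[i]), precomputed once.
    Returns (index of nearest entry, its distance, number of dist() calls).
    """
    d_qp = dist(query, pivot)  # one comparison against the pivot
    evaluations = 1
    best_idx, best_d = -1, math.inf
    for i, item in enumerate(database):
        # Lower bound on dist(query, item) from the triangle inequality.
        lower_bound = abs(d_qp - pivot_dists[i])
        if lower_bound >= best_d:
            continue  # item cannot beat the current best: skip the comparison
        d = dist(query, item)
        evaluations += 1
        if d < best_d:
            best_idx, best_d = i, d
    return best_idx, best_d, evaluations


# Toy usage: Euclidean distance stands in for an expensive structural metric.
points: List[Tuple[float, float]] = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
pivot_point = (0.0, 0.0)
to_pivot = [math.dist(pivot_point, p) for p in points]
idx, d, n_calls = nearest_with_pivot((1.0, 0.1), points, pivot_point, math.dist, to_pivot)
```

The pruning is only valid because the distance obeys the triangle inequality; with a non-metric similarity score the skipped entries could contain the true nearest neighbor, which is why such scores require comparing the query against every database entry.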
Document type:
Journal article
Algorithms, MDPI AG, 2015, Special Issue Algorithmic Themes in Bioinformatics, Volume 8 (Issue 4), pp.20. 〈 Giuseppe Lancia 〉. 〈10.3390/a8040850〉

https://hal.inria.fr/hal-01250539
Contributor: Rumen Andonov
Submitted on: Wednesday, January 6, 2016 - 15:06:32
Last modified on: Wednesday, May 16, 2018 - 11:23:35
Document(s) archived on: Thursday, April 7, 2016 - 15:47:53

File

algorithms-4-850.pdf
Files produced by the author(s)

Citation

Rumen Andonov, Hristo Djidjev, Gunnar Klau, Mathilde Le Boudic-Jamin, Inken Wohlers. Automatic Classification of Protein Structures Using the Maximum Contact Map Overlap Metric. Algorithms, MDPI AG, 2015, Special Issue Algorithmic Themes in Bioinformatics, Volume 8 (Issue 4), pp.20. 〈 Giuseppe Lancia 〉. 〈10.3390/a8040850〉. 〈hal-01250539〉
