Exact Protein Structure Classification Using the Maximum Contact Map Overlap Metric

Abstract: In this work we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space makes it possible to avoid pairwise comparisons against the entire database and thus to significantly accelerate exploration of the protein space compared to non-metric spaces. We show on a small gold-standard superfamily classification benchmark set of 6,759 proteins that our exact scheme classifies up to 224 out of 236 queries correctly, and on a larger, extended version of the benchmark up to 1361 out of 1369 queries. Our k-NN classification thus provides a promising approach for the automatic classification of protein structures into SCOP or CATH based on flexible contact map overlap alignments.
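
The claim that a metric makes it possible to skip most database comparisons rests on the triangle inequality. The following is a minimal sketch of that general idea, not the scheme used in the report: it uses Euclidean distance between 2-D points as a stand-in for the max-CMO metric (which is not implemented here), and the function name `nearest_neighbor_with_pivot`, the single-pivot strategy, and the toy data are all illustrative assumptions.

```python
import math
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def nearest_neighbor_with_pivot(
    query: T,
    database: Sequence[T],
    pivot: T,
    dist: Callable[[T, T], float],
) -> Tuple[T, float, int]:
    """Exact 1-NN search that prunes candidates with the triangle inequality.

    Returns (best_item, best_distance, number_of_query_distance_evaluations).
    Hypothetical helper for illustration; `dist` must be a true metric.
    """
    # Offline step: pivot-to-item distances do not depend on the query,
    # so in practice they would be computed once and stored.
    pivot_to_item = [dist(pivot, x) for x in database]

    # Online step: a single distance from the query to the pivot.
    d_qp = dist(query, pivot)
    evaluations = 1

    # For any metric, dist(query, x) >= |dist(query, pivot) - dist(pivot, x)|.
    bounds = sorted(
        ((abs(d_qp - d_px), i) for i, d_px in enumerate(pivot_to_item)),
        key=lambda pair: pair[0],
    )

    best_item, best_dist = database[bounds[0][1]], math.inf
    for lower_bound, i in bounds:
        if lower_bound >= best_dist:
            # Bounds are sorted, so no remaining item can beat the current best.
            break
        d = dist(query, database[i])
        evaluations += 1
        if d < best_dist:
            best_item, best_dist = database[i], d
    return best_item, best_dist, evaluations


# Toy data: 2-D points with Euclidean distance standing in for a structure metric.
database = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0), (10.0, 11.0)]
query = (9.5, 10.0)
result = nearest_neighbor_with_pivot(query, database, database[0], math.dist)
```

Because the lower bound is only valid when the distance satisfies the triangle inequality, the same pruning cannot be applied safely to a non-metric similarity score; in that case every database entry would have to be compared with the query.
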
Document type:
Report
[Research Report] INRIA Rennes - Bretagne Atlantique and University of Rennes 1, France; Genome Informatics, University of Duisburg-Essen, Germany; Life Sciences, CWI, Science Park 123, 1098 XG Amsterdam, The Netherlands; Los Alamos National Laboratory, Los Alamos, NM, USA. 2014, pp.262 - 273

https://hal.inria.fr/hal-01093776
Contributor: Mathilde Le Boudic-Jamin
Submitted on: Friday, 12 December 2014 - 15:29:08
Last modified on: Wednesday, 16 May 2018 - 11:23:35
Document(s) archived on: Saturday, 15 April 2017 - 07:06:35

File

report.pdf
Files produced by the author(s)

Citation

Inken Wohlers, Mathilde Le Boudic-Jamin, Hristo Djidjev, Gunnar W. Klau, Rumen Andonov. Exact Protein Structure Classification Using the Maximum Contact Map Overlap Metric. [Research Report] INRIA Rennes - Bretagne Atlantique and University of Rennes 1, France; Genome Informatics, University of Duisburg-Essen, Germany; Life Sciences, CWI, Science Park 123, 1098 XG Amsterdam, The Netherlands; Los Alamos National Laboratory, Los Alamos, NM, USA. 2014, pp.262 - 273. 〈hal-01093776〉
