Parallel seed-based approach to multiple protein structure similarities detection

Abstract: Finding similarities between protein structures is a crucial task in molecular biology. Most existing tools require proteins to be aligned in an order-preserving way and find only a single alignment even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee: the returned alignments have both Root Mean Squared Deviations (coordinate-based as well as internal-distance-based) lower than a given threshold, if such alignments exist. We do not require the alignments to be order-preserving (i.e., we consider non-sequential alignments), which makes our algorithm suitable for detecting similar domains when comparing multi-domain proteins as well as for detecting structural repetitions within a single protein. Because the search space for non-sequential alignments is much larger than for sequential ones, the computational burden is addressed by extensive use of parallel computing techniques: coarse-grained parallelism that uses the available CPU cores and fine-grained parallelism that exploits bit-level concurrency as well as vector instructions.
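
The two deviation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the representation of an alignment as two equal-length lists of matched 3D points, and the normalization of the internal-distance measure over unordered residue pairs are all assumptions made for illustration. The coordinate-based function also assumes the two point sets have already been superimposed; it performs no optimal rotation or translation.

```python
import math

def rmsd_coordinates(a, b):
    """Coordinate-based RMSD between matched points a[i] <-> b[i].

    Assumption: a and b are already superimposed; no fitting is done here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("point lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(a))

def rmsd_internal_distances(a, b):
    """Internal-distance RMSD: compares pairwise distances inside each set.

    Assumption: normalized by the number of unordered pairs (i, j), i < j.
    Invariant under rigid motion, so no superposition is needed.
    """
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            total += diff * diff
            pairs += 1
    return math.sqrt(total / pairs)

def passes_threshold(a, b, tau):
    """Hypothetical check: both deviations fall below the threshold tau."""
    return rmsd_coordinates(a, b) < tau and rmsd_internal_distances(a, b) < tau
```

Because the matched points are given as explicit pairs, nothing in this sketch requires the residues to appear in the same order in both proteins, which is what a non-sequential alignment allows.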
Document type:
Preprint, working paper
2014

Cited literature: 34 references

https://hal.inria.fr/hal-01093809
Contributor: Mathilde Le Boudic-Jamin
Submitted on: Thursday, 11 December 2014 - 10:56:08
Last modified on: Wednesday, 16 May 2018 - 11:23:35
Document(s) archived on: Thursday, 12 March 2015 - 10:21:31

File

sp_main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01093809, version 1

Citation

Guillaume Chapuis, Mathilde Le Boudic-Jamin, Rumen Andonov, Hristo Djidjev, Dominique Lavenier. Parallel seed-based approach to multiple protein structure similarities detection. 2014. ⟨hal-01093809⟩
