Parallel Seed-Based Approach to Multiple Protein Structure Similarities Detection

Guillaume Chapuis 1, Mathilde Le Boudic-Jamin 1, Rumen Andonov 1,*, Hristo Djidjev 2, Dominique Lavenier 1
* Corresponding author
1 GenScale - Scalable, Optimized and Parallel Algorithms for Genomics
Inria Rennes – Bretagne Atlantique, IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
Abstract: Finding similarities between protein structures is a crucial task in molecular biology. Most existing tools require proteins to be aligned in an order-preserving way and find only a single alignment even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial, and it comes with a quality guarantee: the returned alignments have both root mean squared deviations (coordinate-based as well as internal-distance-based) lower than a given threshold, if such alignments exist. We do not require the alignments to be order-preserving (i.e., we consider nonsequential alignments), which makes our algorithm suitable for detecting similar domains when comparing multidomain proteins as well as for detecting structural repetitions within a single protein. Because the search space for nonsequential alignments is much larger than for sequential ones, we address the computational burden through extensive use of parallel computing techniques: coarse-grained parallelism that uses the available CPU cores and fine-grained parallelism that exploits bit-level concurrency as well as vector instructions.
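To make the two quality measures in the abstract concrete, here is a minimal Python sketch of a coordinate-based RMSD and an internal-distance-based RMSD evaluated on a nonsequential alignment. It is not the authors' implementation. The function names, the representation of an alignment as a list of index pairs, and the normalization by the number of residue pairs are all assumptions made for illustration. The coordinate-based version also assumes that the second structure has already been superimposed on the first; computing the optimal superposition is left out.

```python
import math
from itertools import combinations


def rmsd_coordinates(coords_a, coords_b, alignment):
    """Coordinate-based RMSD over the aligned residue pairs.

    Assumption: coords_b is already superimposed on coords_a; no optimal
    rigid transformation is searched for here.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one pair")
    total = sum(math.dist(coords_a[i], coords_b[j]) ** 2 for i, j in alignment)
    return math.sqrt(total / len(alignment))


def rmsd_internal_distances(coords_a, coords_b, alignment):
    """Internal-distance-based RMSD over the aligned residue pairs.

    Compares each intra-structure distance in A with the matching distance
    in B. Assumption: the sum is normalized by the number of unordered
    pairs; other conventions exist.
    """
    if len(alignment) < 2:
        raise ValueError("alignment must contain at least two pairs")
    total = 0.0
    count = 0
    for (i1, j1), (i2, j2) in combinations(alignment, 2):
        dist_a = math.dist(coords_a[i1], coords_a[i2])
        dist_b = math.dist(coords_b[j1], coords_b[j2])
        total += (dist_a - dist_b) ** 2
        count += 1
    return math.sqrt(total / count)


def within_threshold(coords_a, coords_b, alignment, threshold):
    """True when both RMSD values are below the given threshold."""
    return (
        rmsd_coordinates(coords_a, coords_b, alignment) < threshold
        and rmsd_internal_distances(coords_a, coords_b, alignment) < threshold
    )


# Toy data: structure B holds the same three points as A in a different order,
# so the matching alignment is nonsequential (not order-preserving).
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
structure_b = [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
nonsequential_alignment = [(0, 1), (1, 2), (2, 0)]

# The same structure shifted by one unit along x: internal distances are
# unchanged, but the coordinates no longer coincide without superposition.
structure_b_shifted = [(x + 1.0, y, z) for x, y, z in structure_b]
```

The shifted copy shows why the two measures are complementary: the internal-distance RMSD is invariant under rigid motion, while the coordinate RMSD depends on how the structures are superimposed.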
Document type:
Journal article
Scientific Programming, IOS Press, 2015, 2015, 〈10.1155/2015/279715〉

https://hal.inria.fr/hal-01235331
Contributor: Dominique Lavenier
Submitted on: Thursday, December 3, 2015 - 08:41:23
Last modified on: Tuesday, January 16, 2018 - 15:54:20
Document(s) archived on: Saturday, April 29, 2017 - 00:08:28

File

ppam2013_submission_241.pdf
Files produced by the author(s)

Citation

Guillaume Chapuis, Mathilde Le Boudic-Jamin, Rumen Andonov, Hristo Djidjev, Dominique Lavenier. Parallel Seed-Based Approach to Multiple Protein Structure Similarities Detection. Scientific Programming, IOS Press, 2015, 2015, 〈10.1155/2015/279715〉. 〈hal-01235331〉
