
Parallel seed-based approach to multiple protein structure similarities detection

Abstract : Finding similarities between protein structures is a crucial task in molecular biology. Most existing tools require proteins to be aligned in an order-preserving way and find only a single alignment even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee: the returned alignments have both Root Mean Squared Deviations (coordinate-based as well as internal-distance-based) lower than a given threshold, if such alignments exist. We do not require the alignments to be order-preserving (i.e. we consider non-sequential alignments), which makes our algorithm suitable for detecting similar domains when comparing multi-domain proteins as well as for detecting structural repetitions within a single protein. Because the search space for non-sequential alignments is much larger than for sequential ones, the computational burden is addressed by extensive use of parallel computing techniques: coarse-grain parallelism that makes use of the available CPU cores and fine-grain parallelism that exploits bit-level concurrency as well as vector instructions.
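
As a rough illustration of the quality guarantee described in the abstract, the sketch below evaluates both deviations for a given non-sequential alignment and compares them with a threshold. It is not the authors' implementation: the function names, the alignment representation (a list of index pairs), and the normalisation of the internal-distance RMSD (averaged over all pairs of aligned residues) are assumptions made for this example, and the coordinate-based RMSD assumes the two structures have already been superimposed.

```python
import math
from itertools import combinations


def rmsd_coordinates(a, b, alignment):
    """Coordinate-based RMSD over aligned residue pairs.

    Assumption: structures a and b are already superimposed; no optimal
    rotation or translation is computed here.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one pair")
    squared = sum(math.dist(a[i], b[j]) ** 2 for i, j in alignment)
    return math.sqrt(squared / len(alignment))


def rmsd_internal_distances(a, b, alignment):
    """Internal-distance RMSD: compares intra-structure distances.

    Assumption: the mean is taken over all unordered pairs of aligned
    residues; other normalisations exist.
    """
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("alignment must contain at least two pairs")
    squared = sum(
        (math.dist(a[i], a[k]) - math.dist(b[j], b[l])) ** 2
        for (i, j), (k, l) in pairs
    )
    return math.sqrt(squared / len(pairs))


def within_threshold(a, b, alignment, tau):
    """True if both deviations are strictly below the threshold tau."""
    return (
        rmsd_coordinates(a, b, alignment) < tau
        and rmsd_internal_distances(a, b, alignment) < tau
    )


# Hypothetical toy data: b is a reordered copy of a, shifted by 1 along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
b = [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (0.0, 0.0, 1.0)]
# Non-sequential alignment: residue order differs between a and b.
alignment = [(0, 2), (1, 0), (2, 1)]
```

In this toy case the internal distances are identical in both structures, so the internal-distance deviation is zero, while the coordinate-based deviation reflects the unit shift because no superposition is performed.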
Document type : Preprints, Working Papers, ...

Cited literature [34 references]
Contributor : Mathilde Le Boudic-Jamin
Submitted on : Thursday, December 11, 2014 - 10:56:08 AM
Last modification on : Tuesday, October 19, 2021 - 11:58:56 PM
Long-term archiving on : Thursday, March 12, 2015 - 10:21:31 AM


Files produced by the author(s)


  • HAL Id : hal-01093809, version 1


Guillaume Chapuis, Mathilde Le Boudic-Jamin, Rumen Andonov, Hristo Djidjev, Dominique Lavenier. Parallel seed-based approach to multiple protein structure similarities detection. 2014. ⟨hal-01093809⟩


