Parallelization on graphic hardware : contributions to RNA folding and sequence alignment

Guillaume Rizk 1
1 SYMBIOSE - Biological systems and models, bioinformatics and sequences
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : Bioinformatics requires the analysis of large amounts of data. With the recent advent of next-generation sequencing technologies, which generate data at low cost, the computational power needed has increased dramatically. Graphics Processing Units (GPUs) are now programmable beyond simple graphics computations, providing cheap high performance for general-purpose applications. This thesis explores the use of GPUs for bioinformatics applications. First, this work focuses on the computation of secondary structures of RNA sequences. This computation is traditionally carried out with a dynamic programming algorithm, which poses significant challenges for a GPU implementation. We introduce a new tiled implementation providing good data locality and therefore very efficient GPU code. We note that our algorithmic modification also enables tiling and subsequent vectorization of the CPU program, allowing us to conduct a fair CPU-GPU comparison. Second, this work addresses the short sequence alignment problem. We present an attempt at GPU parallelization using the seed-and-extend paradigm. Since this attempt was unsuccessful, we then focus on the development of a program running on CPU. Our main contribution is a new algorithm that quickly filters candidate alignment locations, based on the precomputation of tiles of the dynamic programming matrix. This new algorithm in fact proved more effective in a sequential CPU program and led to an efficient new CPU aligner. Our work provides examples of both successful and unsuccessful attempts at GPU parallelization. These two points of view allow us to evaluate the efficiency of GPUs and the role they can play in bioinformatics.
Document type :
Theses
Cited literature [80 references]

https://tel.archives-ouvertes.fr/tel-00634901
Contributor : Sébastien Erhel
Submitted on : Monday, October 24, 2011 - 11:47:47 AM
Last modification on : Friday, November 16, 2018 - 1:25:23 AM
Document(s) archived on : Thursday, November 15, 2012 - 10:22:12 AM

Identifiers

  • HAL Id : tel-00634901, version 1

Citation

Guillaume Rizk. Parallelization on graphic hardware : contributions to RNA folding and sequence alignment. Computer Science [cs]. Université Rennes 1, 2011. English. ⟨NNT : 2011REN1S021⟩. ⟨tel-00634901⟩
