KLAST: fast and sensitive software to compare large genomic databanks on cloud

Ivaylo Petrov 1, Sébastien Brillet 1, Erwan Drezen 1, Sylvain Quiniou 2, L Antin 2, Patrick Durand 2, Dominique Lavenier 1
1 GenScale - Scalable, Optimized and Parallel Algorithms for Genomics
IRISA-D7 - Data and Knowledge Management, Inria Rennes – Bretagne Atlantique
Abstract: As the genomic data generated by high-throughput sequencing machines continue to grow exponentially, the need for very efficient bioinformatics tools to extract relevant knowledge from this mass of data remains strong. Comparing sequences is still a major task in this discovery process, but it is becoming more and more time-consuming. KLAST is sequence comparison software optimized to compare two nucleotide or protein data sets, typically a set of query sequences and a reference bank. KLAST's performance comes from a new indexing scheme, an optimized seed-extend methodology, and a multi-level parallel implementation. To scale up to NGS data processing, a Hadoop version has been designed. Experiments demonstrate good scalability and a large speed-up over BLAST, the reference software of the domain. In addition, computation can optionally be performed on compressed data without any loss of performance.
Document type:
Conference paper
World Congress in Computer Science, Computer Engineering, and Applied Computing, Jul 2015, Las Vegas, United States

https://hal.inria.fr/hal-01235339
Contributor: Dominique Lavenier
Submitted on: Monday, 30 November 2015 - 09:32:09
Last modified on: Wednesday, 16 May 2018 - 11:23:35
Document(s) archived on: Saturday, 29 April 2017 - 00:03:18

File

BioComp_BIC2743.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01235339, version 1

Citation

Ivaylo Petrov, Sébastien Brillet, Erwan Drezen, Sylvain Quiniou, L Antin, et al. KLAST: fast and sensitive software to compare large genomic databanks on cloud. World Congress in Computer Science, Computer Engineering, and Applied Computing, Jul 2015, Las Vegas, United States. 〈hal-01235339〉
