Efficient seeding techniques for protein similarity search

Abstract: We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets for constructing seeds with optimal sensitivity/selectivity trade-offs. We propose several design methods and use them to construct several alphabets. We then analyze seeds built over those alphabets and compare them with the standard Blastp seeding method [2,3], as well as with the family of vector seeds proposed in [4]. Although the formalism of subset seeds is less expressive (but less costly to implement) than the accumulative principle used in Blastp and vector seeds, our seeds show similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix.
Document type:
Conference paper
Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., and Toma, C. Proceedings of the 2nd International Conference BIRD, Jul 2008, Vienna, Austria. Springer Berlin Heidelberg, 13, pp.466-478, 2008, Communications in Computer and Information Science. 〈10.1007/978-3-540-70600-7〉

https://hal.inria.fr/inria-00335564
Contributor: Laurent Noé
Submitted on: Wednesday, 29 October 2008 - 20:07:23
Last modified on: Tuesday, 6 March 2018 - 17:40:54
Document(s) archived on: Tuesday, 28 June 2011 - 17:34:40

Files

paper.pdf
Files produced by the author(s)

Citation

Mikhail Roytberg, Anna Gambin, Laurent Noé, Slawomir Lasota, Eugenia Furletova, et al. Efficient seeding techniques for protein similarity search. Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., and Toma, C. Proceedings of the 2nd International Conference BIRD, Jul 2008, Vienna, Austria. Springer Berlin Heidelberg, 13, pp.466-478, 2008, Communications in Computer and Information Science. 〈10.1007/978-3-540-70600-7〉. 〈inria-00335564〉
