Identifying SNPs without a reference genome by comparing raw reads

Pierre Peterlongo 1,*, Nicolas Schnel 1, Nadia Pisanti 2, Marie-France Sagot 3, Vincent Lacroix 4
* Corresponding author
1 SYMBIOSE - Biological systems and models, bioinformatics and sequences
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
3 BAMBOO - An algorithmic view on genomes, cells, and environments
Inria Grenoble - Rhône-Alpes, LBBE - Laboratoire de Biométrie et Biologie Evolutive
Abstract: Next generation sequencing (NGS) technologies are being applied to many fields of biology, notably to survey the polymorphism across individuals of a species. However, while single nucleotide polymorphisms (SNPs) are almost routinely identified in model organisms, the detection of SNPs in non-model species remains very challenging because almost all methods rely on a reference genome. We address here the problem of identifying SNPs without a reference genome. For this, we propose an approach that compares two sets of raw reads. We show that a SNP corresponds to a recognisable pattern in the de Bruijn graph built from the reads, and we propose algorithms to identify these patterns, which we call mouths. We outline the potential of our method on real data. The method is tailored to short reads (typically Illumina) and works well even when the coverage is low, where it reports few but highly confident SNPs. Our program, called kisSnp, can be downloaded here: http://alcovna.genouest.org/kissnp/.
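To make the pattern mentioned in the abstract concrete, here is a minimal sketch of how a SNP can show up as a bubble in a de Bruijn graph built from two read sets. It is an illustration only, not the kisSnp algorithm, and the paper's exact definition of a mouth may differ. The function names (`kmers`, `successors`, `find_snp_bubbles`) and the toy reads are hypothetical. The sketch also assumes error-free reads on a single strand, with no reverse complements and no coverage filtering.

```python
from itertools import combinations


def kmers(reads, k):
    """Return the set of all k-mers occurring in the reads."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def successors(node, nodes):
    """k-mers in the graph that overlap `node` by its last k-1 characters."""
    return [node[1:] + c for c in "ACGT" if node[1:] + c in nodes]


def find_snp_bubbles(reads_a, reads_b, k):
    """Find pairs of length-2k sequences that differ by one base.

    Assumption (hypothetical simplification): a SNP appears as a branching
    k-mer followed by two non-branching paths of exactly k k-mers that
    reconverge on a common successor.
    """
    nodes = kmers(reads_a, k) | kmers(reads_b, k)
    bubbles = set()
    for start in nodes:
        branches = successors(start, nodes)
        if len(branches) < 2:
            continue
        paths = []
        for first in branches:
            path = [first]
            # Follow the branch while it stays a simple (unbranched) path.
            while len(path) < k:
                nxt = successors(path[-1], nodes)
                if len(nxt) != 1:
                    break
                path.append(nxt[0])
            if len(path) == k:
                paths.append(path)
        for p, q in combinations(paths, 2):
            # The two branches must close the bubble on a shared k-mer.
            if not set(successors(p[-1], nodes)) & set(successors(q[-1], nodes)):
                continue
            # Spell each path: start k-mer plus the last base of each node.
            seq_p = start + "".join(n[-1] for n in p)
            seq_q = start + "".join(n[-1] for n in q)
            if sum(a != b for a, b in zip(seq_p, seq_q)) == 1:
                bubbles.add(tuple(sorted((seq_p, seq_q))))
    return sorted(bubbles)


# Toy data: the two read sets differ by a single base (G versus T).
reads_a = ["GATTACAGG", "ACAGGCTCA"]
reads_b = ["GATTACATG", "ACATGCTCA"]
print(find_snp_bubbles(reads_a, reads_b, 4))
# prints [('TACAGGCT', 'TACATGCT')]
```

Each reported pair spells the two alleles together with k bases of left context. A practical tool would additionally need to handle sequencing errors, both strands, and repeats, none of which this sketch attempts.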
Document type:
Conference paper
String Processing and Information Retrieval, Oct 2010, Los Cabos, Mexico. 2010, Lecture Notes in Computer Science


https://hal.inria.fr/inria-00514887
Contributor: Pierre Peterlongo
Submitted on: Friday, 3 September 2010 - 15:38:26
Last modified on: Wednesday, 29 November 2017 - 16:21:30
Document(s) archived on: Tuesday, 23 October 2012 - 15:31:08

File

kisSnp_spire_reviewed.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00514887, version 1

Citation

Pierre Peterlongo, Nicolas Schnel, Nadia Pisanti, Marie-France Sagot, Vincent Lacroix. Identifying SNPs without a reference genome by comparing raw reads. String Processing and Information Retrieval, Oct 2010, Los Cabos, Mexico. 2010, Lecture Notes in Computer Science. 〈inria-00514887〉

