Finding and Characterizing Repeats in Plant Genomes

Jacques Nicolas; Pierre Peterlongo; Sébastien Tempel

doi:10.1007/978-1-4939-3167-5_17

Book chapter, Year: 2015

Jacques Nicolas

Role: Author
PersonId: 5225
IdHAL: jacques-nicolas
IdRef: 116276142

Dynamics, Logics and Inference for biological Systems and Sequences

Pierre Peterlongo

Role: Author
PersonId: 171998
IdHAL: pierre-peterlongo
ORCID: 0000-0003-0776-6407
IdRef: 12482062X

Scalable, Optimized and Parallel Algorithms for Genomics

Sébastien Tempel

Role: Author

Laboratoire de chimie bactérienne

Abstract

Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter offers a guided tour of the software available to help biologists search for these repeats and test hypothetical models of their structure. Since transposable elements are a major source of repeats in plants, many methods have been used or developed for this large class of sequences. These methods are representative of the range of tools available for other classes of repeats, so a whole section is devoted to this topic, together with a selection of the main existing software. To understand how these tools work and how repeats can be found efficiently in genomes, it is necessary to look at the technical issues involved in the large-scale search for these structures. Because it can be hard to keep up with the profusion of proposals in this dynamic field, the rest of the chapter is devoted to the foundations of the search for repeats and more complex patterns. The second section introduces the key concepts needed to understand the current state of the art in playing with words, applied to genomic sequences. This can be seen as the first stage of a very general approach called linguistic analysis, which is concerned with the analysis of natural or artificial texts. Words, the lexical level, correspond to simple repeated entities in texts or strings. In practice, biologists need to represent more complex entities, in which a repeat family is built on more abstract structures: direct or inverted small repeats, motifs, and composition constraints, as well as ordering and distance constraints between these elementary blocks. In linguistic terms, this corresponds to the syntactic level of a language. The last section introduces concepts and practical tools that can be used to reach this syntactic level in biological sequence analysis.
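
As a rough illustration of the two levels mentioned in the abstract, the following Python sketch indexes fixed-length words to find exact repeats (the lexical level) and then looks for inverted repeats whose two arms lie within a given distance of each other (a very simple structural constraint). It is not taken from the chapter and does not reproduce any of the tools it reviews; the function names, the parameters `k`, `min_gap` and `max_gap`, and the restriction to the alphabet A, C, G, T are all assumptions made for the example.

```python
from collections import defaultdict

# Assumption: sequences use only the uppercase letters A, C, G, T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def build_index(sequence, k):
    """Map every word of length k to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def repeated_words(sequence, k):
    """Return the words of length k that occur at least twice."""
    index = build_index(sequence, k)
    return {word: pos for word, pos in index.items() if len(pos) > 1}


def reverse_complement(word):
    """Return the reverse complement of a DNA word."""
    return word.translate(_COMPLEMENT)[::-1]


def inverted_repeats(sequence, k, min_gap, max_gap):
    """Return sorted (i, j, word) triples where `word` starts at i, its
    reverse complement starts at j, and the number of bases between the
    two arms lies in [min_gap, max_gap]."""
    index = build_index(sequence, k)
    hits = []
    for word, positions in index.items():
        partner_positions = index.get(reverse_complement(word), ())
        for i in positions:
            for j in partner_positions:
                gap = j - (i + k)  # bases separating the two arms
                if min_gap <= gap <= max_gap:
                    hits.append((i, j, word))
    return sorted(hits)
```

For instance, `repeated_words("ACGTTTTACGT", 4)` reports the word `ACGT` at positions 0 and 7, and `inverted_repeats("AAGCTTTTGCTT", 4, 1, 10)` reports `AAGC` at position 0 paired with its reverse complement `GCTT` at position 8. A plain dictionary index like this one only handles exact words and small inputs; it is meant to make the vocabulary concrete, not to scale to whole genomes.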

Keywords

Repeats; Transposon; Indexing; Algorithmics on words; Pattern matching

Domains

Bioinformatics [q-bio.QM]

Main file

plant_repeats_nicolas_finHAL.pdf (771.42 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01228488

Submitted on: Thursday, November 19, 2015, 17:15:44

Last modified on: Friday, March 24, 2023, 14:53:01

Long-term archiving on: Friday, April 28, 2017, 20:03:22

Dates and versions

hal-01228488, version 1 (19-11-2015)

Identifiers

HAL Id: hal-01228488, version 1
DOI: 10.1007/978-1-4939-3167-5_17

Cite

Jacques Nicolas, Pierre Peterlongo, Sébastien Tempel. Finding and Characterizing Repeats in Plant Genomes. David Edwards. Plant Bioinformatics: Methods and Protocols, 1374, Humana Press - Springer Science+Business Media, pp.365, 2015, Methods in Molecular Biology, 978-1-4939-3166-8. ⟨10.1007/978-1-4939-3167-5_17⟩. ⟨hal-01228488⟩

Collections

IRD INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA UNIV-AMU INSA-RENNES IRISA CENTRALESUPELEC IRISA-D7 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

337 views

1087 downloads