Book section. Year: 2015

Finding and Characterizing Repeats in Plant Genomes

Jacques Nicolas, Pierre Peterlongo, Sébastien Tempel

Abstract

Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter offers a guided tour of available software that can help biologists search for these repeats and test hypothetical models of their structure. Because transposable elements are a major source of repeats in plants, many methods have been used or developed for this large class of sequences. These methods are representative of the range of tools available for other classes of repeats, so a whole section is devoted to this topic, together with a selection of the main existing software. To understand how these tools work and how repeats can be found efficiently in genomes, it is necessary to look at the technical issues involved in the large-scale search for these structures. Because it can be hard to keep up with the profusion of proposals in this dynamic field, the rest of the chapter is devoted to the foundations of the search for repeats and more complex patterns. The second section introduces the key concepts needed to understand the current state of the art in playing with words, applied to genomic sequences. This can be seen as the first stage of a very general approach, linguistic analysis, which is concerned with the analysis of natural or artificial texts. Words, the lexical level, correspond to simple repeated entities in texts or strings. Biologists, however, need to represent more complex entities, in which a repeat family is built on more abstract structures: direct or inverted small repeats, motifs, and composition constraints, as well as ordering and distance constraints between these elementary blocks. In linguistic terms, this corresponds to the syntactic level of a language. The last section introduces concepts and practical tools that can be used to reach this syntactic level in biological sequence analysis.
Main file: plant_repeats_nicolas_finHAL.pdf (771.42 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01228488, version 1 (19-11-2015)

Cite

Jacques Nicolas, Pierre Peterlongo, Sébastien Tempel. Finding and Characterizing Repeats in Plant Genomes. David Edwards. Plant Bioinformatics: Methods and Protocols, 1374, Humana Press - Springer Science+Business Media, pp.365, 2015, Methods in Molecular Biology, 978-1-4939-3166-8. ⟨10.1007/978-1-4939-3167-5_17⟩. ⟨hal-01228488⟩