
De l'intérêt des modèles grammaticaux pour la reconnaissance de motifs dans les séquences génomiques (On the value of grammatical models for pattern recognition in genomic sequences)

Abstract : This thesis examines the value of using grammars to search for patterns in genomic sequences. Since the 1980s, work has shown that, in theory, high-level grammars are expressive enough to describe complex biological patterns. In particular, David Searls proposed a grammar formalism dedicated to biology: the string variable grammar (SVG). This formalism led to Logol, a grammatical language and analysis tool developed by the Dyliss team, in which this thesis was carried out. Logol is designed to be flexible enough to express a wide range of biological patterns. The fact that grammars remain largely unknown as a way of modeling biological patterns raises a question: is the grammatical formalism really relevant to the recognition of biological patterns? This thesis attempts to answer that question through an exploratory approach. We study the relevance of grammatical patterns by applying Logol to six different genomic pattern-matching applications. Solving these biological problems in practice highlighted several features of grammatical patterns. First, using grammatical models has a cost in terms of performance. Second, the expressiveness of grammatical models covers a broad spectrum of biological patterns, unlike the alternative approaches, and some patterns modeled by grammars have no alternative solution. It also turns out that for some complex patterns, such as those combining sequence and structure, the grammatical approach is the most suitable. Finally, one conclusion of the thesis is that the different approaches are not really in competition; rather, there is everything to gain from making them cooperate.
Contributor : ABES STAR
Submitted on : Tuesday, February 28, 2017 - 11:35:24 AM
Last modification on : Wednesday, February 2, 2022 - 3:52:51 PM
Long-term archiving on : Monday, May 29, 2017 - 1:26:01 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01416734, version 2


Aymeric Antoine-Lorquin. De l'intérêt des modèles grammaticaux pour la reconnaissance de motifs dans les séquences génomiques. Bio-informatique [q-bio.QM]. Université Rennes 1, 2016. French. ⟨NNT : 2016REN1S086⟩. ⟨tel-01416734v2⟩


