String variables. Matches, for instance, the sequence "GTTGAGAGGTTGA". This model includes the two types of string variables: the mute variable and X. It matches the sequence with X = "GTTGA" (both mute variable values empty) and also with X = "GTT". As in Prolog, two mute variables correspond to different variables and may have different values. The mute variable is useful for describing gaps.
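The role of a string variable and a gap can be illustrated with a minimal Python sketch. It does not reproduce the model of this example, whose exact expression is not shown here; it assumes a simpler, hypothetical model "X, gap, X" with a single gap, and the function name `repeat_matches` and the `min_len` parameter are illustrative only.

```python
def repeat_matches(seq, min_len=3):
    """Enumerate pairs (X, gap) such that seq contains X, then gap, then X again.

    Hypothetical model "X, gap, X": X is a string variable bound to the same
    value at both occurrences, and gap plays the role of a mute variable
    (its value is unconstrained and may be empty).
    """
    results = set()
    for start in range(len(seq)):
        for end in range(start + min_len, len(seq) + 1):
            x = seq[start:end]
            # Look for the first later, non-overlapping occurrence of X.
            again = seq.find(x, end)
            if again != -1:
                results.add((x, seq[end:again]))
    return results


matches = repeat_matches("GTTGAGAGGTTGA")
for x, gap in sorted(matches):
    print(f"X={x!r} gap={gap!r}")
```

Under this assumed model, both X = "GTTGA" and X = "GTT" are found in the sequence, each with a different gap value, which is the kind of ambiguity the caption describes.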

String variable with morphism. Matches, for instance, "TCGCGA" with X = "TCG". This model describes a biological palindrome. It includes an occurrence of X, followed by a transformation of X by a reverse morphism (direction -) described by the non-terminal wc.
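The palindrome check can be sketched in Python as follows. This is an illustration only: it assumes that wc denotes Watson-Crick complementation over the alphabet A, C, G, T and that the palindrome covers the whole sequence; the helper names `wc` and `palindrome_half` are hypothetical.

```python
# Watson-Crick complement table (assumed meaning of the morphism wc).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wc(x):
    """Apply the complement morphism and read the result in reverse direction."""
    return x.translate(COMPLEMENT)[::-1]


def palindrome_half(seq):
    """Return X if seq equals X followed by wc(X), otherwise None."""
    if len(seq) % 2 != 0:
        return None
    x = seq[: len(seq) // 2]
    return x if seq[len(seq) // 2 :] == wc(x) else None


print(palindrome_half("TCGCGA"))
```

For "TCGCGA", the first half "TCG" has complement "AGC", which read in reverse gives "CGA", the second half of the sequence.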

Overlapping succession. Matches, for instance, "GATTGAGATT". Here

Views. Matches, for instance, the sequence "CCTAGTATCCGATAC". This model states that, to be admissible, a sequence must contain both "ATCCGA" and "TAGTAT".
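The admissibility condition can be sketched in Python as a conjunction of containment tests. This is a simplified reading that treats each view as a plain substring constraint on the same sequence; the function name `view_positions` is hypothetical.

```python
def view_positions(seq, motifs):
    """Return {motif: first 0-based position} if every motif occurs in seq.

    Each motif acts as one view on the same sequence: the sequence is
    admissible only if all views are satisfied at the same time.
    Returns None when at least one motif is absent.
    """
    positions = {}
    for motif in motifs:
        index = seq.find(motif)
        if index == -1:
            return None
        positions[motif] = index
    return positions


print(view_positions("CCTAGTATCCGATAC", ["ATCCGA", "TAGTAT"]))
```

In this example the two motifs overlap in the sequence (they share the letters "AT"), which a simple succession of the two motifs could not express.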

