S. Abdeddaïm and B. Morgenstern, Speeding up the DIALIGN multiple alignment program by using the 'greedy alignment of biological sequences library' (GABIOS-LIB), In JOBIM, pp.1-11, 2000.

S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, Basic local alignment search tool, Journal of Molecular Biology, vol.215, issue.3, pp.403-410, 1990.
DOI : 10.1016/S0022-2836(05)80360-2

A. Ashkenazi, Targeting death and decoy receptors of the tumour-necrosis factor superfamily, Nature Reviews Cancer, 2002.

T. L. Bailey and C. Elkan, Fitting a mixture model by expectation maximization to discover motifs in biopolymers, Second International Conference on Intelligent Systems for Molecular Biology, 1994.

M. Blanchette, W. J. Kent, C. Riemer, L. Elnitski, A. Smit et al., Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner, Genome Research, vol.14, issue.4, pp.708-715, 2004.
DOI : 10.1101/gr.1933104

A. Brazma, I. Jonassen, I. Eidhammer, and D. Gilbert, Approaches to the Automatic Discovery of Patterns in Biosequences, Journal of Computational Biology, vol.5, issue.2, pp.277-304, 1998.
DOI : 10.1089/cmb.1998.5.279

A. Califano, SPLASH: structural pattern localization analysis by sequential histograms, Bioinformatics, vol.16, issue.4, pp.341-357, 2000.
DOI : 10.1093/bioinformatics/16.4.341

F. Coste and D. Fredouille, What is the search space for the inference of nondeterministic, unambiguous and deterministic automata?, 2003.

F. Coste and G. Kerbellec, A Similar Fragments Merging Approach to Learn Automata on Proteins, ECML, pp.522-529, 2005.
DOI : 10.1007/11564096_50

URL : https://hal.archives-ouvertes.fr/inria-00070340

F. Coste, G. Kerbellec, B. Idmont, D. Fredouille, and C. Delamarche, Apprentissage d'automates par fusions de paires de fragments significativement similaires et premières expérimentations sur les protéines MIP, In JOBIM, 2004.

R. Durbin, S. R. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, 1999.
DOI : 10.1017/CBO9780511790492

K. El Karkouri, H. Gueuné, and C. Delamarche, MIPDB: a relational database dedicated to MIP family proteins, Biology of the Cell, vol.97, issue.7, pp.535-543, 2005.
DOI : 10.1042/BC20040123

URL : https://hal.archives-ouvertes.fr/hal-00107987

S. Henikoff and J. G. Henikoff, Amino acid substitution matrices from protein blocks, Proc. Natl. Acad. Sci. USA, vol.89, issue.22, pp.10915-10919, 1992.
DOI : 10.1073/pnas.89.22.10915

D. Hernández, R. Gras, and R. D. Appel, MoDEL: an efficient strategy for ungapped local multiple alignment, Computational Biology and Chemistry, vol.28, issue.2, pp.119-128, 2004.
DOI : 10.1016/j.compbiolchem.2004.01.001

K. L. Jensen, M. P. Styczynski, I. Rigoutsos, and G. N. Stephanopoulos, A generic motif discovery algorithm for sequential data, Bioinformatics, vol.22, issue.1, 2005.
DOI : 10.1093/bioinformatics/bti745

I. Jonassen, J. F. Collins, and D. G. Higgins, Finding flexible patterns in unaligned protein sequences, Protein Science, vol.4, issue.8, pp.1587-1595, 1995.
DOI : 10.1002/pro.5560040817

K. Karplus, Hidden Markov models for detecting remote protein homologies, Bioinformatics, vol.14, issue.10, pp.846-865, 1998.
DOI : 10.1093/bioinformatics/14.10.846

K. J. Lang, B. A. Pearlmutter, and R. A. Price, Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm, Lecture Notes in Computer Science, vol.1433, pp.1-12, 1998.
DOI : 10.1007/BFb0054059

C. E. Lawrence, S. F. Altschul, M. S. Boguski, J. S. Liu, A. F. Neuwald et al., Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment, Science, vol.262, issue.5131, pp.208-214, 1993.
DOI : 10.1126/science.8211139

C. Lee, C. Grasso, and M. Sharlow, Multiple sequence alignment using partial order graphs, Bioinformatics, vol.18, issue.3, pp.452-464, 2002.
DOI : 10.1093/bioinformatics/18.3.452

I. C. Lerman, J. Azé, H. Briand, M. Sebag, and R. , Indice probabiliste discriminant de vraisemblance du lien pour des données volumineuses. RNTI-E-1, numéro spécial, pp.69-94, 2004.

B. Morgenstern, DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment, Bioinformatics, vol.15, issue.3, pp.211-218, 1999.
DOI : 10.1093/bioinformatics/15.3.211

C. Nevill-Manning and I. Witten, Identifying hierarchical structure in sequences: A linear-time algorithm, Journal of Artificial Intelligence Research, vol.7, pp.67-82, 1997.

C. G. Nevill-Manning, T. D. Wu, and D. L. Brutlag, Highly specific protein sequence motifs for genome analysis, Proceedings of the National Academy of Sciences, vol.95, issue.11, pp.5865-5871, 1998.
DOI : 10.1073/pnas.95.11.5865

J. Oncina and P. Garcia, Inferring regular languages in polynomial update time, Pattern Recognition and Image Analysis, pp.49-61, 1992.

P. A. Pevzner, H. Tang, and G. Tesler, De Novo Repeat Classification and Fragment Assembly, Genome Research, vol.14, issue.9, pp.1786-1796, 2004.
DOI : 10.1101/gr.2395204

I. Rigoutsos and A. Floratos, Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm [published erratum appears in Bioinformatics 1998;14(2):229], Bioinformatics, vol.14, issue.1, pp.55-67, 1998.
DOI : 10.1093/bioinformatics/14.1.55

I. Rigoutsos, A. Floratos, L. Parida, Y. Gao, and D. Platt, The Emergence of Pattern Discovery Techniques in Computational Biology, Metabolic Engineering, vol.2, issue.3, pp.159-177, 2000.
DOI : 10.1006/mben.2000.0151

S. Eddy, HMMER user's guide: biological sequence analysis using profile hidden Markov models, 1998.

Y. Sakakibara, Grammatical inference in bioinformatics, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.27, issue.7, pp.1051-1062, 2005.
DOI : 10.1109/TPAMI.2005.140

D. B. Searls, The language of genes, Nature, vol.420, issue.6912, pp.211-217, 2002.
DOI : 10.1038/29667

Z. Solan, D. Horn, E. Ruppin, and S. Edelman, Unsupervised learning of natural languages, Proceedings of the National Academy of Sciences, vol.102, issue.33, pp.11629-11634, 2005.
DOI : 10.1073/pnas.0409746102

W. R. Taylor, The classification of amino acid conservation, Journal of Theoretical Biology, vol.119, issue.2, pp.205-218, 1986.
DOI : 10.1016/S0022-5193(86)80075-3

T. Yokomori, Learning non-deterministic finite automata from queries and counterexamples, Machine Intelligence, vol.13, pp.169-189, 1994.

E. M. Zdobnov and R. Apweiler, InterProScan - an integration platform for the signature-recognition methods in InterPro, Bioinformatics, vol.17, issue.9, pp.847-848, 2001.
DOI : 10.1093/bioinformatics/17.9.847