S. Altschul, Amino acid substitution matrices from an information theoretic perspective, Journal of Molecular Biology, vol.219, issue.3, pp.555-565, 1991.
DOI : 10.1016/0022-2836(91)90193-A

S. Altschul et al., PSI-BLAST pseudocounts and the minimum description length principle, Nucleic Acids Research, vol.37, issue.3, pp.815-824, 2009.
DOI : 10.1093/nar/gkn981
URL : http://doi.org/10.1093/nar/gkn981

S. Altschul et al., The Construction and Use of Log-Odds Substitution Scores for Multiple Sequence Alignment, PLoS Computational Biology, vol.6, issue.7, pp.1-17, 2010.
DOI : 10.1371/journal.pcbi.1000852

T. L. Bailey and M. Gribskov, Score Distributions for Simultaneous Matching to Multiple Motifs, Journal of Computational Biology, vol.4, issue.1, pp.45-59, 1997.
DOI : 10.1089/cmb.1997.4.45

M. Brown et al., Using Dirichlet mixture priors to derive hidden Markov models for protein families, Proc. First Int. Conf. on Intelligent Systems for Molecular Biology, pp.47-55, 1993.

F. Coste, Learning the Language of Biological Sequences, Topics in Grammatical Inference, 2016.
DOI : 10.1007/978-3-662-48395-4_8
URL : https://hal.archives-ouvertes.fr/hal-01244770

F. Coste and G. Kerbellec, Learning automata on protein sequences, JOBIM, pp.199-210, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00180429

R. Durbin et al., Biological sequence analysis, 1998.
DOI : 10.1017/CBO9780511790492

S. R. Eddy et al., Maximum Discrimination Hidden Markov Models of Sequence Consensus, Journal of Computational Biology, vol.2, issue.1, 1995.
DOI : 10.1089/cmb.1995.2.9

S. Eddy and T. J. Wheeler, HMMER User's Guide, 2015.

S. R. Eddy, Maximum likelihood fitting of extreme value distributions, 1997.

S. R. Eddy, A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation, PLoS Computational Biology, vol.4, issue.5, pp.1-14, 2008.
DOI : 10.1371/journal.pcbi.1000069

M. Gerstein et al., Volume changes in protein evolution, Journal of Molecular Biology, vol.236, issue.4, pp.1067-1078, 1994.
DOI : 10.1016/0022-2836(94)90012-4

W. N. Grundy et al., meta-MEME: Motif-based hidden Markov models of protein families, Bioinformatics, vol.13, issue.4, pp.397-406, 1997.
DOI : 10.1093/bioinformatics/13.4.397

S. Henikoff and J. Henikoff, Amino acid substitution matrices from protein blocks, Proc. Natl. Acad. Sci, pp.10915-10919, 1992.
DOI : 10.1073/pnas.89.22.10915
URL : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC50453/pdf

S. Henikoff and J. Henikoff, Position-based sequence weights, Journal of Molecular Biology, vol.243, issue.4, pp.574-578, 1994.
DOI : 10.1016/0022-2836(94)90032-9
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.106.3439

S. Henikoff and J. Henikoff, Using substitution probabilities to improve position-specific scoring matrices, Bioinformatics, vol.12, issue.2, pp.135-143, 1996.
DOI : 10.1093/bioinformatics/12.2.135
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.109.6780

C. Barrett, R. Hughey, and K. Karplus, Scoring hidden Markov models, Comput. Appl. Biosci, vol.13, issue.2, pp.191-199, 1997.

R. Hughey et al., SAM: sequence alignment and modeling software system, 2003.

S. Johnson, Remote protein homology detection using hidden Markov models, 2006.

S. Karlin and S. F. Altschul, Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes, Proc. Natl. Acad. Sci, pp.2264-2268, 1990.
DOI : 10.1073/pnas.87.6.2264

K. Karplus et al., Hidden Markov models for detecting remote protein homologies, Bioinformatics, vol.14, issue.10, pp.846-856, 1998.
DOI : 10.1093/bioinformatics/14.10.846
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.5575

K. Karplus et al., Calibrating E-values for hidden Markov models using reverse-sequence null models, Bioinformatics, vol.21, issue.22, pp.4107-4115, 2005.
DOI : 10.1093/bioinformatics/bti629

G. Kerbellec, Apprentissage d'automates modelisant des familles de sequences proteiques, 2008.
URL : https://hal.archives-ouvertes.fr/tel-00327938

A. Krogh and G. Mitchison, Maximum entropy weighting of aligned sequences of proteins or DNA, pp.215-221, 1995.

V. Nguyen et al., Dirichlet Mixtures, the Dirichlet Process, and the Structure of Protein Space, Journal of Computational Biology, vol.20, issue.1, pp.1-18, 2013.
DOI : 10.1089/cmb.2012.0244

L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989.

K. Sjolander et al., Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology, Bioinformatics, vol.12, issue.4, pp.327-345, 1996.
DOI : 10.1093/bioinformatics/12.4.327

S. Sunyaev et al., PSIC: profile extraction from sequence alignments with position-specific counts of independent observations, Protein Engineering Design and Selection, vol.12, issue.5, pp.387-394, 1999.
DOI : 10.1093/protein/12.5.387

J. D. Thompson et al., CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research, vol.22, issue.22, pp.4673-4680, 1994.
DOI : 10.1093/nar/22.22.4673