S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Research, vol.25, pp.3389-3402, 1997.

E. Asgari and M. R. Mofrad, Continuous distributed representation of biological sequences for deep proteomics and genomics, PloS one, vol.10, issue.11, p.141287, 2015.

T. M. Bakheet and A. J. Doig, Properties and identification of human protein drug targets, Bioinformatics, vol.25, issue.4, pp.451-457, 2009.

B. Berger, N. M. Daniels, and Y. W. Yu, Computational biology in the 21st century: Scaling with compressive algorithms, Commun. ACM, vol.59, issue.8, pp.72-80, 2016.

C. Cai, L. Han, Z. Ji, and Y. Chen, Enzyme family classification by support vector machines, Proteins: Structure, Function, and Bioinformatics, vol.55, issue.1, pp.66-76, 2004.

C. Cai, L. Han, Z. L. Ji, X. Chen, and Y. Z. Chen, SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence, Nucleic acids research, vol.31, issue.13, pp.3692-3697, 2003.

Y. Cai and K. Chou, Predicting enzyme subclass by functional domain composition and pseudo amino acid composition, Journal of Proteome Research, vol.4, issue.3, pp.967-971, 2005.

K. Chou, Pseudo amino acid composition and its applications in bioinformatics, proteomics and system biology, Current Proteomics, vol.6, issue.4, pp.262-274, 2009.

A. Cornish-Bowden, Current IUBMB recommendations on enzyme nomenclature and kinetics, Perspectives in Science, vol.1, pp.74-87, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01494614

A. Dalkiran, A. S. Rifaioglu, M. J. Martin, R. Cetin-Atalay, V. Atalay et al., ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature, BMC Bioinformatics, vol.19, issue.1, p.334, 2018.

M. desJardins, P. D. Karp, M. Krummenacker, T. J. Lee, and C. A. Ouzounis, Prediction of enzyme classification from protein sequence without the use of sequence similarity, Proc Int Conf Intell Syst Mol Biol, vol.5, pp.92-99, 1997.

P. D. Dobson and A. J. Doig, Predicting enzyme class from protein structure without alignments, Journal of molecular biology, vol.345, issue.1, pp.187-199, 2005.

R. D. Finn, J. Clements, and S. R. Eddy, HMMER web server: interactive sequence similarity searching, Nucleic Acids Research, vol.39, issue.2, pp.29-37, 2011.

L. Fu, B. Niu, Z. Zhu, S. Wu, and W. Li, CD-HIT: accelerated for clustering the next-generation sequencing data, Bioinformatics, vol.28, issue.23, pp.3150-3152, 2012.

A. Gattiker, K. Michoud, C. Rivoire, A. H. Auchincloss, E. Coudert et al., Automated annotation of microbial proteomes in SWISS-PROT, Computational Biology and Chemistry, vol.27, issue.1, pp.49-58, 2003.

W. Huang, H. Chen, S. Hwang, and S. Ho, Accurate prediction of enzyme subfamily class using an adaptive fuzzy k-nearest neighbor method, Biosystems, vol.90, issue.2, pp.405-413, 2007.

P. Jones, D. Binns, H. Chang, M. Fraser, W. Li et al., InterProScan 5: genome-scale protein function classification, Bioinformatics, vol.30, issue.9, pp.1236-1240, 2014.

A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou et al., FastText.zip: Compressing text classification models, 2016.

A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov, Bag of tricks for efficient text classification, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.2, pp.427-431, 2017.

D. Kimothi, A. Soni, P. Biyani, and J. M. Hogan, Distributed representations for biological sequence analysis, 2016.

E. Kretschmann, W. Fleischmann, and R. Apweiler, Automatic rule generation for protein annotation with the C4.5 data mining algorithm applied on SWISS-PROT, Bioinformatics, vol.17, pp.920-926, 2001.

N. Kumar and J. Skolnick, EFICAz2.5: application of a high-precision enzyme function predictor to 396 proteomes, Bioinformatics, vol.28, issue.20, pp.2687-2688, 2012.

S. K. Kummerfeld and S. A. Teichmann, Protein domain organisation: adding order, BMC Bioinformatics, vol.10, issue.1, p.39, 2009.

Y. Li, S. Wang, R. Umarov, B. Xie, M. Fan et al., DEEPre: sequence-based enzyme EC number prediction by deep learning, Bioinformatics, vol.34, issue.5, pp.760-769, 2018.

Y. H. Li, J. Y. Xu, L. Tao, X. F. Li, S. Li et al., SVM-Prot 2016: a web-server for machine learning prediction of protein functional families from sequence irrespective of similarity, PloS one, vol.11, issue.8, p.155290, 2016.

L. Lu, Z. Qian, Y. Cai, and Y. Li, ECS: an automatic enzyme classifier based on functional domain composition, Computational Biology and Chemistry, vol.31, issue.3, pp.226-232, 2007.

S. Matsuda, J. Vert, H. Saigo, N. Ueda, H. Toh et al., A novel representation of protein sequences for prediction of subcellular location using support vector machines, Protein Science, vol.14, issue.11, pp.2804-2813, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00433582

T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, Distributed representations of words and phrases and their compositionality, Advances in neural information processing systems, pp.3111-3119, 2013.

A. L. Mitchell, T. K. Attwood, P. C. Babbitt, M. Blum, P. Bork et al., InterPro in 2019: improving coverage, classification and access to protein sequence annotations, Nucleic acids research, vol.47, pp.351-360, 2018.

C. Nagao, N. Nagano, and K. Mizuguchi, Prediction of detailed enzyme functions and identification of specificity determining residues by random forests, PLoS One, vol.9, issue.1, 2014.

E. Nasibov and C. Kandemir-Cavas, Efficiency analysis of KNN and minimum distance-based classifiers in enzyme family prediction, Computational Biology and Chemistry, vol.33, issue.6, pp.461-464, 2009.

S. Quester and D. Schomburg, EnzymeDetector: an integrated enzyme function prediction tool and database, BMC Bioinformatics, vol.12, issue.1, p.376, 2011.

E. Quevillon, V. Silventoinen, S. Pillai, N. Harte, N. Mulder et al., InterProScan: protein domains identifier, Nucleic Acids Research, vol.33, issue.2, pp.116-120, 2005.

S. A. Rahman, S. M. Cuesta, N. Furnham, G. L. Holliday, and J. M. Thornton, EC-BLAST: a tool to automatically search and compare enzyme reactions, Nature Methods, vol.11, issue.2, p.171, 2014.

H. Shen and K. Chou, EzyPred: a top-down approach for predicting enzyme functional classes and subclasses, Biochemical and biophysical research communications, vol.364, issue.1, pp.53-59, 2007.

The UniProt Consortium, UniProt: a hub for protein information, Nucleic Acids Research, vol.43, pp.204-212, 2015.

J. Yang, R. Yan, A. Roy, D. Xu, J. Poisson et al., The I-TASSER Suite: protein structure and function prediction, Nature methods, vol.12, issue.1, p.7, 2015.

C. Yu, N. Zavaljevski, V. Desai, and J. Reifman, Genome-wide enzyme annotation with precision control: Catalytic families (CatFam) databases, Proteins: Structure, Function, and Bioinformatics, vol.74, issue.2, pp.449-460, 2009.