[List of figures (fragment): a test episode of the MNIST REVEAL game played by a baseline CNN agent; random search for finding high-performing algorithms at test time on the CEP meta-dataset, RL vs. random search; RL vs. ACTIVMETAL and random search on other meta-datasets; performance comparisons; relating REVEAL to POMDPs and to bandits.]

Kaggle, https://www.kaggle.com, 2018.

C. Andrieu, N. de Freitas, and A. Doucet, Sequential MCMC for Bayesian model selection, IEEE Signal Processing Workshop on Higher-Order Statistics, pp.130-134, 1999.

F. Assunção, N. Lourenço, P. Machado, and B. Ribeiro, Denser: Deep evolutionary network structured representation, 2018.

B. Baker, O. Gupta, N. Naik, and R. Raskar, Designing neural network architectures using reinforcement learning, 2016.

R. Bardenet, M. Brendel, B. Kégl, and M. Sebag, Collaborative hyperparameter tuning, Proceedings of the 30th International Conference on Machine Learning (ICML), vol.28, pp.199-207, 2013.
URL : https://hal.archives-ouvertes.fr/in2p3-00907381

Y. Bengio, A. Courville, and P. Vincent, Representation learning: A review and new perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1798-1828, 2013.

K. P. Bennett, G. Kunapuli, J. Hu, and J.-S. Pang, Bilevel optimization and machine learning, Computational Intelligence: Research Frontiers, vol.5050, pp.25-47, 2008.

J. Bergstra and Y. Bengio, Random search for hyper-parameter optimization, Journal of Machine Learning Research, vol.13, pp.281-305, 2012.

J. Bergstra, D. Yamins, and D. D. Cox, Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures, 30th International Conference on Machine Learning, vol.28, pp.115-123, 2013.

J. S. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl, Algorithms for hyper-parameter optimization, Advances in Neural Information Processing Systems, pp.2546-2554, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00642998

A. L. Blum and P. Langley, Selection of relevant features and examples in machine learning, Artificial Intelligence, vol.97, issue.1-2, pp.273-324, 1997.

M. Boullé, Compression-based averaging of selective naive bayes classifiers, Journal of Machine Learning Research, vol.8, pp.1659-1685, 2007.

M. Boullé, A parameter-free classification method for large scale learning, Journal of Machine Learning Research, vol.10, pp.1367-1385, 2009.

P. Brazdil, C. Giraud-Carrier, C. Soares, and R. Vilalta, Metalearning: Applications to data mining, 2008.

L. Breiman, Random forests, Machine Learning, vol.45, pp.5-32, 2001.

E. Brochu, V. M. Cora, and N. de Freitas, A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning, 2010.

R. Caruana, A. Niculescu-mizil, G. Crew, and A. Ksikes, Ensemble selection from libraries of models, 21st International Conference on Machine Learning, p.18, 2004.

G. C. Cawley and N. L. Talbot, Preventing over-fitting during model selection via Bayesian regularisation of the hyper-parameters, Journal of Machine Learning Research, vol.8, pp.841-861, 2007.

B. Colson, P. Marcotte, and G. Savard, An overview of bilevel programming, Annals of Operations Research, vol.153, pp.235-256, 2007.

E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, and Q. V. Le, Autoaugment: Learning augmentation policies from data, 2018.

S. Dempe, Foundations of bilevel programming, 2002.

T. G. Dietterich, Approximate statistical tests for comparing supervised classification learning algorithms, Neural Computation, vol.10, issue.7, pp.1895-1923, 1998.

R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2001.

B. Efron, Estimating the error rate of a prediction rule: Improvement on cross-validation, Journal of the American Statistical Association, vol.78, issue.382, pp.316-331, 1983.

K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek et al., Towards an empirical foundation for assessing bayesian optimization of hyperparameters, NIPS workshop on Bayesian Optimization in Theory and Practice, 2013.

H. J. Escalante, M. Montes, and L. E. Sucar, Particle swarm model selection, Journal of Machine Learning Research, vol.10, pp.405-440, 2009.

S. Falkner, A. Klein, and F. Hutter, Combining hyperband and bayesian optimization, BayesOpt 2017 NIPS Workshop on Bayesian Optimization, 2017.

S. Falkner, A. Klein, and F. Hutter, BOHB: Robust and efficient hyperparameter optimization at scale, 2018.

S. Falkner, A. Klein, and F. Hutter, Practical hyperparameter optimization, International Conference on Learning Representations 2018 Workshop track, 2018.

M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum et al., Efficient and Robust Automated Machine Learning, Advances in Neural Information Processing Systems, vol.28, pp.2962-2970, 2015.

M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum et al., Methods for improving bayesian optimization for automl, Proceedings of the International Conference on Machine Learning, 2015.

M. Feurer, J. Springenberg, and F. Hutter, Initializing bayesian hyperparameter optimization via meta-learning, Proceedings of the AAAI Conference on Artificial Intelligence, pp.1128-1135, 2015.

J. H. Friedman, Greedy function approximation: A gradient boosting machine, The Annals of Statistics, vol.29, issue.5, pp.1189-1232, 2001.

N. Fusi, R. Sheth, and M. Elibol, Probabilistic matrix factorization for automated machine learning, Advances in Neural Information Processing Systems, pp.3352-3361, 2018.

S. Geman, E. Bienenstock, and R. Doursat, Neural networks and the bias/variance dilemma, Neural Computation, vol.4, issue.1, pp.1-58, 1992.

G. H. Golub and C. Reinsch, Singular value decomposition and least squares solutions, Linear Algebra, pp.134-151, 1971.

I. Guyon, K. Bennett, G. Cawley, H. J. Escalante, S. Escalera et al., Design of the 2015 ChaLearn AutoML challenge, 2015 International Joint Conference on Neural Networks (IJCNN), pp.1-8, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01381164

I. Guyon, K. Bennett, G. Cawley, H. J. Escalante, S. Escalera et al., AutoML challenge 2015: Design and first results, Proc. of AutoML, 2015.

I. Guyon, S. Gunn, M. Nikravesh, and L. Zadeh, Feature extraction, foundations and applications, Studies in Fuzziness and Soft Computing, 2006.

I. Guyon et al., Analysis of the AutoML challenge series 2015-2018, Springer Series on Challenges in Machine Learning, 2019.
URL : https://hal.archives-ouvertes.fr/hal-01906197

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann et al., The Weka data mining software: An update, SIGKDD Explor. Newsl, vol.11, issue.1, pp.10-18, 2009.

T. Hastie, S. Rosset, R. Tibshirani, and J. Zhu, The entire regularization path for the support vector machine, Journal of Machine Learning Research, vol.5, pp.1391-1415, 2004.

T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical learning: Data mining, inference, and prediction, 2001.

P. Hennig and C. J. Schuler, Entropy search for information-efficient global optimization, Journal of Machine Learning Research, vol.13, pp.1809-1837, 2012.

F. Hutter, H. H. Hoos, and K. Leyton-Brown, Sequential model-based optimization for general algorithm configuration, International Conference on Learning and Intelligent Optimization (LION), pp.507-523, 2011.

F. Hutter, L. Kotthoff, and J. Vanschoren, editors, Automated Machine Learning: Methods, Systems, Challenges, Springer, 2019.

J. P. Ioannidis, Why most published research findings are false, PLoS Medicine, vol.2, issue.8, p.124, 2005.

M. I. Jordan, On statistics, computation and scalability, Bernoulli, vol.19, issue.4, pp.1378-1390, 2013.

S. M. Kakade, A natural policy gradient, Advances in neural information processing systems, pp.1531-1538, 2002.

S. S. Keerthi, V. Sindhwani, C. , and O. , An efficient method for gradient-based adaptation of hyperparameters in SVM models, Advances in Neural Information Processing Systems, 2007.

R. D. King, C. Feng, and A. Sutherland, Statlog: comparison of classification algorithms on large real-world problems, Applied Artificial Intelligence an International Journal, vol.9, issue.3, pp.289-333, 1995.

A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, Fast bayesian hyperparameter optimization on large datasets, In Electronic Journal of Statistics, vol.11, 2017.

R. Kohavi and G. H. John, Wrappers for feature subset selection, Artificial Intelligence, vol.97, issue.1-2, pp.273-324, 1997.

B. Komer, J. Bergstra, and C. Eliasmith, Hyperopt-sklearn: automatic hyperparameter configuration for scikit-learn, ICML workshop on AutoML, pp.2825-2830, 2014.

V. R. Konda and J. N. Tsitsiklis, Actor-critic algorithms, Advances in neural information processing systems, pp.1008-1014, 2000.

J. Langford, Clever methods of overfitting, 2005.

Y. Lecun and C. Cortes, MNIST handwritten digit database, 2010.

L. Li, K. Jamieson, G. Desalvo, A. Rostamizadeh, and A. Talwalkar, Hyperband: A novel bandit-based approach to hyperparameter optimization, 2016.

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez et al., Continuous control with deep reinforcement learning, 2015.

Z. Liu, I. Guyon, J. J. Junior, M. Madadi, S. Escalera et al., AutoCV Challenge Design and Baseline Results, CAp 2019 -Conférence sur l'Apprentissage Automatique, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02265053

J. Lloyd, Freeze Thaw Ensemble Construction, 2016.

R. W. Lutz, Logitboost with trees applied to the WCCI 2006 performance prediction challenge datasets, Proc. IJCNN06, pp.2966-2969, 2006.

M. Mısır and M. Sebag, Alors: An algorithm recommender system, Artificial Intelligence, vol.244, pp.291-314, 2017.

V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou et al., Playing atari with deep reinforcement learning, 2013.

J. Mockus, V. Tiesis, and A. Zilinskas, The application of bayesian methods for seeking the extremum, Towards global optimization, vol.2, p.2, 1978.

M. Momma and K. P. Bennett, A pattern search method for model selection of support vector regression, Proceedings of the SIAM International Conference on Data Mining. SIAM, 2002.

G. Moore, C. Bergeron, and K. P. Bennett, Model selection for primal SVM, Machine Learning, vol.85, pp.1-2, 2011.

G. M. Moore, C. Bergeron, and K. P. Bennett, Nonsmooth bilevel programming for hyperparameter selection, IEEE International Conference on Data Mining Workshops, pp.374-381, 2009.

M. Muja and D. Lowe, Fast library for approximate nearest neighbors (flann), 2013.

M. A. Muñoz, L. Villanova, D. Baatar, and K. Smith-miles, Instance spaces for machine learning classification, Machine Learning, vol.107, pp.109-147, 2018.

M. Opper and O. Winther, Gaussian processes and SVM: Mean field results and leave-one-out, pp.43-65, 2000.

S. J. Pan and Q. Yang, A survey on transfer learning, IEEE Transactions on Knoweledge and Data Engineering, vol.22, issue.10, pp.1345-1359, 2010.

M. Y. Park and T. Hastie, L1-regularization path algorithm for generalized linear models, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.69, issue.4, pp.659-677, 2007.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, Efficient neural architecture search via parameter sharing, 2018.

C. E. Rasmussen, Gaussian processes in machine learning, Advanced lectures on machine learning, pp.63-71, 2004.

E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu et al., Large-scale evolution of image classifiers, Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.2902-2911, 2017.

J. Rennie and N. Srebro, Fast maximum margin matrix factorization for collaborative prediction, Proceedings of the 22nd international conference on Machine learning, pp.713-719, 2005.

J. Rissanen, Modeling by shortest data description, Automatica, vol.14, issue.5, pp.465-471, 1978.

B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2001.

J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz, Trust region policy optimization, International conference on machine learning, pp.1889-1897, 2015.

J. Snoek, H. Larochelle, and R. P. Adams, Practical Bayesian optimization of machine learning algorithms, Advances in Neural Information Processing Systems, vol.25, pp.2951-2959, 2012.

N. Srebro, J. Rennie, and T. Jaakkola, Maximum-margin matrix factorization, Advances in neural information processing systems, vol.17, pp.1329-1336, 2005.

A. Statnikov, L. Wang, and C. F. Aliferis, A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification, BMC Bioinformatics, vol.9, issue.1, 2008.

D. Stern, R. Herbrich, T. Graepel, H. Samulowitz, L. Pulina et al., Collaborative expert portfolio management, AAAI, pp.179-184, 2010.

Q. Sun, B. Pfahringer, and M. Mayo, Full model selection in the space of data mining operators, Genetic and Evolutionary Computation Conference, pp.1503-1504, 2012.

L. Sun-hosoya, I. Guyon, and M. Sebag, Activmetal: Algorithm recommendation with active meta learning, IAL 2018 workshop, ECML PKDD, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01931262

L. Sun-hosoya, I. Guyon, and M. Sebag, Lessons learned from the automl challenge, Conférence sur l'Apprentissage Automatique, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01811454

R. S. Sutton and A. G. Barto, Reinforcement learning: an introduction. Adaptive computation and machine learning series, 2018.

K. Swersky, J. Snoek, and R. P. Adams, Multi-task Bayesian optimization, Advances in Neural Information Processing Systems, vol.26, pp.2004-2012, 2013.

K. Swersky, J. Snoek, and R. P. Adams, Freeze-thaw bayesian optimization, 2014.

A. Thakur and A. Krohn-grimberghe, Autocompete: A framework for machine learning competitions, AutoML Workshop, International Conference on Machine Learning, 2015.

C. Thornton, F. Hutter, H. H. Hoos, and K. Leyton-Brown, Auto-WEKA: Automated selection and hyper-parameter optimization of classification algorithms, 2012.

C. Thornton, F. Hutter, H. H. Hoos, and K. Leyton-Brown, Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms, Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp.847-855, 2013.

E. Tuv, A. Borisov, G. Runger, and K. Torkkola, Feature selection with ensembles, artificial variables, and redundancy elimination, Journal of Machine Learning Research, vol.10, pp.1341-1366, 2009.

H. van Hasselt, A. Guez, and D. Silver, Deep reinforcement learning with double Q-learning, Thirtieth AAAI Conference on Artificial Intelligence, 2016.

J. Vanschoren, J. N. van Rijn, B. Bischl, and L. Torgo, OpenML: Networked science in machine learning, ACM SIGKDD Explorations Newsletter, vol.15, issue.2, pp.49-60, 2014.

V. Vapnik and O. Chapelle, Bounds on error expectation for support vector machines, Neural computation, vol.12, issue.9, pp.2013-2036, 2000.

V. N. Vapnik, Statistical learning theory, 1998.

E. M. Voorhees, Overview of the trec 2001 question answering track, TREC, pp.42-51, 2001.

C. J. Watkins, Learning from delayed rewards, 1989.

M. Weimer, A. Karatzoglou, Q. Le, and A. Smola, CofiRank: Maximum margin matrix factorization for collaborative ranking, Proceedings of the 21st Annual Conference on Neural Information Processing Systems (NIPS), pp.222-230, 2007.

R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine learning, vol.8, issue.3-4, pp.229-256, 1992.

A. Zhang, N. Ballas, and J. Pineau, A dissection of overfitting and generalization in continuous reinforcement learning, 2018.

B. Zoph and Q. V. Le, Neural architecture search with reinforcement learning, 2016.

Title: Meta-Learning as a Markov Decision Process

Keywords: automated machine learning, meta-learning, Markov decision process

Abstract: Machine learning (ML) has seen enormous successes in recent years and underpins an ever-growing number of real-world applications. AutoML is generally treated as an algorithm/hyper-parameter selection problem. Existing approaches include Bayesian optimization, evolutionary algorithms, and reinforcement learning. Among them, auto-sklearn, which incorporates meta-learning techniques to initialize the search, consistently ranks among the top entries in AutoML challenges. This observation oriented my research toward the field of meta-learning. The main results of this thesis are: (1) HP/model selection: the Freeze-Thaw method was explored to enter the first AutoML challenge, obtaining 3rd place in the final round. (2) ACTIVMETAL: a new algorithm for active meta-learning (ACTIVMETAL) was designed, proposing a greedy solution to the meta-learning problem. (3) REVEAL: a new conceptualization of meta-learning as a Markov decision process was developed and integrated into the broader framework of REVEAL games; solutions based on reinforcement learning were proposed. The work presented in this thesis is empirical in nature: several real-world meta-datasets were used, along with artificial and semi-artificial meta-datasets. The results indicate that RL is a viable approach to the meta-learning problem.
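The greedy active meta-learning strategy summarized in the abstract (ACTIVMETAL) can be illustrated with a minimal sketch. This is not the thesis implementation: the meta-dataset values, the similarity weighting, and the names `META` and `recommend` are hypothetical, chosen only to show the reveal-one-score-at-a-time loop on a datasets-by-algorithms performance matrix.

```python
# Illustrative sketch (assumptions, not the thesis code) of greedy active
# meta-learning: past algorithm scores form a meta-dataset matrix; on a new
# dataset, scores are revealed one at a time by running algorithms, and a
# greedy policy picks the next algorithm using similarity to past datasets.

# Hypothetical meta-dataset: rows = past datasets, columns = algorithms.
META = [
    [0.9, 0.6, 0.3, 0.8],
    [0.2, 0.7, 0.9, 0.4],
    [0.85, 0.55, 0.35, 0.75],
]

def recommend(revealed):
    """Greedy step: weight past datasets by agreement on revealed scores,
    then suggest the unrevealed algorithm with the highest predicted score."""
    n_algos = len(META[0])
    # Similarity weight = inverse mean absolute difference on revealed entries.
    weights = []
    for row in META:
        if revealed:
            diff = sum(abs(row[j] - s) for j, s in revealed.items()) / len(revealed)
        else:
            diff = 0.0  # no evidence yet: all past datasets weighted equally
        weights.append(1.0 / (1e-6 + diff))
    # Weighted-average prediction for each not-yet-revealed algorithm.
    best_j, best_pred = None, float("-inf")
    for j in range(n_algos):
        if j in revealed:
            continue
        pred = sum(w * row[j] for w, row in zip(weights, META)) / sum(weights)
        if pred > best_pred:
            best_j, best_pred = j, pred
    return best_j

# New dataset whose hidden scores resemble past dataset 0:
true_scores = [0.88, 0.58, 0.32, 0.79]
revealed = {}
for _ in range(3):                  # greedily reveal 3 of the 4 scores
    j = recommend(revealed)
    revealed[j] = true_scores[j]    # "running" algorithm j reveals its score
print(max(revealed.values()))
```

Each greedy step plays the algorithm whose predicted score on the new dataset is highest, then folds the revealed score back into the similarity weights, mirroring the collaborative-filtering flavor of the approach described in the abstract.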