J. Chang and Y. Chen, Batch-normalized maxout network in network, arXiv preprint, 2015.

T. Chen, I. Goodfellow, and J. Shlens, Net2Net: Accelerating learning via knowledge transfer, 2016.

C. Farabet, C. Couprie, L. Najman, and Y. LeCun, Learning Hierarchical Features for Scene Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1915-1929, 2013.
DOI : 10.1109/TPAMI.2012.231
URL : https://hal.archives-ouvertes.fr/hal-00742077

A. Graves, S. Fernández, and J. Schmidhuber, Multi-dimensional Recurrent Neural Networks, 2007.
DOI : 10.1007/978-3-540-74690-4_56

K. He, X. Zhang, S. Ren, and J. Sun, Identity Mappings in Deep Residual Networks, ECCV, 2016.
DOI : 10.1007/978-3-319-46493-0_38

S. Honari, J. Yosinski, P. Vincent, and C. Pal, Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.619

G. Huang, M. Ramesh, T. Berg, and E. Learned-Miller, Labeled faces in the wild: a database for studying face recognition in unconstrained environments, 2007.

S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, ICML, 2015.

A. Kae, K. Sohn, H. Lee, and E. Learned-Miller, Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.263

N. Kalchbrenner, I. Danihelka, and A. Graves, Grid long short-term memory, 2016.

D. Kingma and J. Ba, Adam: A method for stochastic optimization, ICLR, 2015.

A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet classification with deep convolutional neural networks, 2012.

P. Kulkarni, J. Zepeda, F. Jurie, P. Pérez, and L. Chevallier, Learning the Structure of Deep Architectures Using L1 Regularization, Proceedings of the British Machine Vision Conference 2015, 2015.
DOI : 10.5244/C.29.23
URL : https://hal.archives-ouvertes.fr/hal-01266462

Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard et al., Handwritten digit recognition with a back-propagation network, NIPS, 1989.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

C. Lee, P. Gallagher, and Z. Tu, Generalizing pooling functions in convolutional neural networks: Mixed, gated, and tree, AISTATS, 2016.

M. Lin, Q. Chen, and S. Yan, Network in network, ICLR, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01460127

S. Liu, J. Yang, C. Huang, and M. Yang, Multi-objective convolutional learning for face labeling, CVPR, 2015.

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298965

D. Mishkin and J. Matas, All you need is a good init, ICLR, 2016.

I. Misra, A. Shrivastava, A. Gupta, and M. Hebert, Cross-stitch networks for multi-task learning, CVPR, 2016.

H. Noh, S. Hong, and B. Han, Learning Deconvolution Network for Semantic Segmentation, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
DOI : 10.1109/ICCV.2015.178

T. Pfister, J. Charles, and A. Zisserman, Flowing ConvNets for Human Pose Estimation in Videos, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
DOI : 10.1109/ICCV.2015.222
URL : http://arxiv.org/abs/1506.02897

O. Ronneberger, P. Fischer, and T. Brox, U-Net: Convolutional Networks for Biomedical Image Segmentation, Medical Image Computing and Computer-Assisted Intervention, 2015.
DOI : 10.1007/978-3-319-24574-4_28

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2015.

S. Singh, D. Hoiem, and D. Forsyth, Swapout: learning an ensemble of deep architectures, NIPS, 2016.

J. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, Striving for simplicity: The all convolutional net, ICLR, 2015.

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, JMLR, 2014.

S. Tsogkas, I. Kokkinos, G. Papandreou, and A. Vedaldi, Deep learning for semantic part segmentation with high-level guidance, arXiv preprint, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01264942

L. Wan, M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus, Regularization of neural networks using DropConnect, ICML, 2013.

H. Zheng, Y. Liu, M. Ji, F. Wu, and L. Fang, Learning high-level prior with convolutional neural networks for semantic segmentation, arXiv preprint, 2015.

Y. Zhou, X. Hu, and B. Zhang, Interlinked Convolutional Neural Networks for Face Parsing, International Symposium on Neural Networks, 2015.
DOI : 10.1007/978-3-319-25393-0_25