A. Ahmed, K. Yu, W. Xu, Y. Gong, and E. Xing, Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks, ECCV, 2008.
DOI : 10.1007/978-3-540-88690-7_6

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.141.6514

Y. Aytar and A. Zisserman, Tabula rasa: Model transfer for object category detection, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126504

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.422.4758

Y. Boureau, F. Bach, Y. Lecun, and J. Ponce, Learning mid-level features for recognition, CVPR, 2010.

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu et al., Natural language processing (almost) from scratch, JMLR, vol.12, pp.2493-2537, 2011.

G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, ECCV Workshop, 2004.

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.
DOI : 10.1109/CVPR.2005.177

URL : https://hal.archives-ouvertes.fr/inria-00548512

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A Large-Scale Hierarchical Image Database, CVPR, 2009.

J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang et al., DeCAF: A deep convolutional activation feature for generic visual recognition, arXiv:1310.1531, 2013.

M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.88, issue.2, pp.303-338, 2010.
DOI : 10.1007/s11263-009-0275-4

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.6629

C. Farabet, C. Couprie, L. Najman, and Y. Lecun, Learning Hierarchical Features for Scene Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1915-1929, 2013.
DOI : 10.1109/TPAMI.2012.231

URL : https://hal.archives-ouvertes.fr/hal-00742077

A. Farhadi, M. K. Tabrizi, I. Endres, and D. Forsyth, A latent model of discriminative aspect, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459350

P. Felzenszwalb, R. Girshick, D. Mcallester, and D. Ramanan, Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1627-1645, 2010.
DOI : 10.1109/TPAMI.2009.167

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.153.2745

P. Felzenszwalb, D. Mcallester, and D. Ramanan, A discriminatively trained, multiscale, deformable part model, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587597

K. Fukushima, Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biological Cybernetics, vol.36, issue.4, pp.193-202, 1980.
DOI : 10.1007/BF00344251

R. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.81

G. Griffin, A. Holub, and P. Perona, Caltech-256 object category dataset, 2007.

G. E. Hinton, Learning multiple layers of representation, Trends in Cognitive Sciences, vol.11, issue.10, pp.428-434, 2007.
DOI : 10.1016/j.tics.2007.09.004

D. H. Hubel and T. N. Wiesel, Receptive fields of single neurones in the cat's striate cortex, The Journal of Physiology, vol.148, issue.3, pp.574-591, 1959.
DOI : 10.1113/jphysiol.1959.sp006308

J. Jiang and C. Zhai, Instance weighting for domain adaptation in NLP, ACL, 2007.

M. Juneja, A. Vedaldi, C. V. Jawahar, and A. Zisserman, Blocks That Shout: Distinctive Parts for Scene Classification, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.124

A. Khosla, T. Zhou, T. Malisiewicz, A. A. Efros, and A. Torralba, Undoing the Damage of Dataset Bias, ECCV, 2012.
DOI : 10.1007/978-3-642-33718-5_12

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, NIPS, 2012.

K. J. Lang and G. E. Hinton, A time delay neural network architecture for speech recognition, 1988.

S. Lazebnik, C. Schmid, and J. Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), 2006.
DOI : 10.1109/CVPR.2006.68

URL : https://hal.archives-ouvertes.fr/inria-00548585

Q. Le, M. Ranzato, R. Monga, M. Devin, K. Chen et al., Building high-level features using large scale unsupervised learning, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013.
DOI : 10.1109/ICASSP.2013.6639343

URL : http://arxiv.org/abs/1112.6209

Q. Le, W. Zou, S. Yeung, and A. Ng, Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995496

Y. Lecun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, pp.541-551, 1989.
DOI : 10.1162/neco.1989.1.4.541

Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.138.1115

Y. Lecun, F. J. Huang, and L. Bottou, Learning methods for generic object recognition with invariance to pose and lighting, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), 2004.
DOI : 10.1109/CVPR.2004.1315150

D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.4931

R. Osadchy, M. Miller, and Y. Lecun, Synergistic Face Detection and Pose Estimation with Energy-Based Models, NIPS, 2005.
DOI : 10.1007/11957959_10

S. Pan and Q. Yang, A survey on transfer learning, IEEE Transactions on Knowledge and Data Engineering, vol.22, issue.10, pp.1345-1359, 2010.

F. Perronnin, J. Sánchez, and T. Mensink, Improving the Fisher Kernel for Large-Scale Image Classification, ECCV, 2010.
DOI : 10.1007/978-3-642-15561-1_11

URL : https://hal.archives-ouvertes.fr/inria-00548630

H. Pirsiavash and D. Ramanan, Detecting activities of daily living in first-person camera views, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6248010

X. Ren and D. Ramanan, Histograms of Sparse Codes for Object Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.417

F. Rosenblatt, The perceptron: A perceiving and recognizing automaton, 1957.

D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors, Nature, vol.323, issue.6088, pp.533-536, 1986.
DOI : 10.1038/323533a0

K. Saenko, B. Kulis, M. Fritz, and T. Darrell, Adapting Visual Category Models to New Domains, ECCV, 2010.
DOI : 10.1007/978-3-642-15561-1_16

URL : http://hdl.handle.net/11858/00-001M-0000-0017-E577-9

P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus et al., OverFeat: Integrated recognition, localization and detection using convolutional networks, arXiv:1312.6229, 2013.

P. Simard, D. Steinkraus, and J. C. Platt, Best practices for convolutional neural networks applied to visual document analysis, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings., pp.958-962, 2003.
DOI : 10.1109/ICDAR.2003.1227801

S. Singh, A. Gupta, and A. A. Efros, Unsupervised Discovery of Mid-Level Discriminative Patches, ECCV, 2012.
DOI : 10.1007/978-3-642-33709-3_6

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, 2003.
DOI : 10.1109/ICCV.2003.1238663

Z. Song, Q. Chen, Z. Huang, Y. Hua, and S. Yan, Contextualizing object detection and classification, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995330

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.660.6015

G. W. Taylor, R. Fergus, Y. Lecun, and C. Bregler, Convolutional Learning of Spatio-temporal Features, ECCV, 2010.
DOI : 10.1007/978-3-642-15567-3_11

T. Tommasi, F. Orabona, and B. Caputo, Safety in numbers: Learning categories from few examples with multi model knowledge transfer, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540064

A. Torralba and A. A. Efros, Unbiased look at dataset bias, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995347

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.208.2314

R. Vaillant, C. Monrocq, and Y. Lecun, Original approach for the localisation of objects in images, IEE Proceedings - Vision, Image, and Signal Processing, vol.141, issue.4, pp.245-250, 1994.
DOI : 10.1049/ip-vis:19941301

S. Yan, J. Dong, Q. Chen, Z. Song, Y. Pan et al., Generalized hierarchical matching for sub-category aware object classification, Visual Recognition Challenge workshop, ECCV, 2012.

M. Zeiler and R. Fergus, Visualizing and Understanding Convolutional Networks, arXiv:1311.2901, 2013.
DOI : 10.1007/978-3-319-10590-1_53

URL : http://arxiv.org/abs/1311.2901

M. Zeiler, G. Taylor, and R. Fergus, Adaptive deconvolutional networks for mid and high level feature learning, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126474

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.227.7393