C. Wang, W. Ren, K. Huang, and T. Tan, Weakly Supervised Object Localization with Latent Category Learning, pp.431-445, 2014.
DOI : 10.1007/978-3-319-10599-4_28

R. G. Cinbis, J. Verbeek, and C. Schmid, Weakly supervised object localization with multi-fold multiple instance learning. arXiv preprint arXiv:1503.00949, 2015.
DOI : 10.1109/tpami.2016.2535231

URL : https://hal.archives-ouvertes.fr/hal-01123482

Y. Lecun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, pp.541-551, 1989.
DOI : 10.1162/neco.1989.1.4.541

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, In: NIPS, pp.1097-1105, 2012.

M. Oquab, L. Bottou, I. Laptev, and J. Sivic, Is object localization for free? Weakly-supervised learning with convolutional neural networks, In: CVPR, pp.685-694, 2015.
DOI : 10.1109/cvpr.2015.7298668

URL : https://hal.archives-ouvertes.fr/hal-01015140

H. Bilen and A. Vedaldi, Weakly Supervised Deep Detection Networks, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.311

URL : http://arxiv.org/abs/1511.02853

R. Girshick, J. Donahue, T. Darrell, and J. Malik, Region-Based Convolutional Networks for Accurate Object Detection and Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.1, pp.142-158, 2016.
DOI : 10.1109/TPAMI.2015.2437384

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, In: NIPS, pp.91-99, 2015.
DOI : 10.1109/TPAMI.2016.2577031

URL : http://arxiv.org/abs/1506.01497

S. Gidaris and N. Komodakis, Object Detection via a Multi-region and Semantic Segmentation-Aware CNN Model, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1134-1142, 2015.
DOI : 10.1109/ICCV.2015.135

A. Torralba, K. P. Murphy, W. T. Freeman, and M. A. Rubin, Context-based vision system for place and object recognition, Proceedings Ninth IEEE International Conference on Computer Vision, pp.273-280, 2003.
DOI : 10.1109/ICCV.2003.1238354

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.3790

A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie, Objects in Context, 2007 IEEE 11th International Conference on Computer Vision, pp.1-8, 2007.
DOI : 10.1109/ICCV.2007.4408986

P. F. Felzenszwalb, R. B. Girshick, D. Mcallester, and D. Ramanan, Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1627-1645, 2010.
DOI : 10.1109/TPAMI.2009.167

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.153.2745

C. Desai, D. Ramanan, and C. Fowlkes, Discriminative models for multi-class object layout, In: ICCV, pp.229-236, 2009.
DOI : 10.1007/s11263-011-0439-x

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.161.8585

O. Chum and A. Zisserman, An Exemplar Model for Learning Object Classes, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.
DOI : 10.1109/CVPR.2007.383050

Z. Shi, P. Siva, and T. Xiang, Transfer Learning by Ranking for Weakly Supervised Object Annotation, Proceedings of the British Machine Vision Conference 2012, p.5, 2012.
DOI : 10.5244/C.26.78

P. Siva, C. Russell, and T. Xiang, In Defence of Negative Mining for Annotating Weakly Labelled Data, pp.594-608, 2012.
DOI : 10.1007/978-3-642-33712-3_43

T. Deselaers, B. Alexe, and V. Ferrari, Weakly Supervised Localization and Learning with Generic Knowledge, International Journal of Computer Vision, vol.100, issue.3, pp.275-293, 2012.
DOI : 10.1007/s11263-012-0538-3

P. Siva, C. Russell, T. Xiang, and L. Agapito, Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp.3238-3245, 2013.
DOI : 10.1109/CVPR.2013.416

H. O. Song, R. Girshick, S. Jegelka, J. Mairal, Z. Harchaoui et al., On learning to localize objects with minimal supervision. arXiv preprint arXiv:1403.1024, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00996849

H. O. Song, Y. J. Lee, S. Jegelka, and T. Darrell, Weakly-supervised discovery of visual pattern configurations, In: NIPS, 2014.

H. Bilen, M. Pedersoli, and T. Tuytelaars, Weakly supervised object detection with posterior regularization, In: BMVC, 2014.
DOI : 10.5244/c.28.52

H. Bilen, M. Pedersoli, and T. Tuytelaars, Weakly supervised object detection with convex clustering, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1081-1089, 2015.
DOI : 10.1109/CVPR.2015.7298711

URL : https://lirias.kuleuven.be/bitstream/123456789/511404/1/3966_final_OA.pdf

B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, Learning deep features for discriminative localization. arXiv preprint, 2015.
DOI : 10.1109/cvpr.2016.319

URL : http://arxiv.org/abs/1512.04150

P. M. Long and L. Tan, PAC learning axis-aligned rectangles with respect to product distributions from multiple-instance examples, Proceedings of the ninth annual conference on Computational learning theory, COLT '96, pp.7-21, 1998.
DOI : 10.1145/238061.238105

B. Alexe, T. Deselaers, and V. Ferrari, Measuring the Objectness of Image Windows, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.11, pp.2189-2202, 2012.
DOI : 10.1109/TPAMI.2012.28

R. Girshick, Fast R-CNN, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1440-1448, 2015.
DOI : 10.1109/ICCV.2015.169

A. Oliva and A. Torralba, The role of context in object recognition, Trends in Cognitive Sciences, vol.11, issue.12, pp.520-527, 2007.
DOI : 10.1016/j.tics.2007.09.009

O. Russakovsky, Y. Lin, K. Yu, and L. Fei-fei, Object-Centric Spatial Pooling for Image Classification, pp.1-15, 2012.
DOI : 10.1007/978-3-642-33709-3_1

C. Doersch, A. Gupta, and A. A. Efros, Context as Supervisory Signal: Discovering Objects with Predictable Context, pp.362-377, 2014.
DOI : 10.1007/978-3-319-10578-9_24

M. Cho, S. Kwak, C. Schmid, and J. Ponce, Unsupervised object discovery and localization in the wild: Part-based matching with bottom-up region proposals, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1201-1210, 2015.
DOI : 10.1109/CVPR.2015.7298724

URL : https://hal.archives-ouvertes.fr/hal-01110036

S. Kwak, M. Cho, I. Laptev, J. Ponce, and C. Schmid, Unsupervised Object Discovery and Tracking in Video Collections, 2015 IEEE International Conference on Computer Vision (ICCV), pp.3173-3181, 2015.
DOI : 10.1109/ICCV.2015.363

URL : https://hal.archives-ouvertes.fr/hal-01153017

K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, Return of the Devil in the Details: Delving Deep into Convolutional Nets, Proceedings of the British Machine Vision Conference 2014, 2014.
DOI : 10.5244/C.28.6

J. R. Uijlings, K. E. Van-de-sande, T. Gevers, and A. W. Smeulders, Selective Search for Object Recognition, International Journal of Computer Vision, vol.104, issue.2, pp.154-171, 2013.
DOI : 10.1007/s11263-013-0620-5

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.361.3382

K. He, X. Zhang, S. Ren, and J. Sun, Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.37, issue.9, pp.1904-1916, 2015.
DOI : 10.1109/TPAMI.2015.2389824

M. Everingham, L. Van-gool, C. K. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.88, issue.2, pp.303-338, 2010.
DOI : 10.1007/s11263-009-0275-4

M. Everingham, L. Van-gool, C. K. Williams, J. Winn, and A. Zisserman, The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results.
DOI : 10.1007/s11263-014-0733-5

URL : https://lirias.kuleuven.be/bitstream/123456789/485215/1/3852_final.pdf

R. Collobert, K. Kavukcuoglu, and C. Farabet, Torch7: A Matlab-like environment for machine learning, In: BigLearn, NIPS Workshop. Number EPFL-CONF-192376, 2011.

S. Gidaris and N. Komodakis, LocNet: Improving localization accuracy for object detection. arXiv preprint, 2015.
DOI : 10.1109/cvpr.2016.92

URL : https://hal.archives-ouvertes.fr/hal-01245707

S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran et al., cuDNN: Efficient primitives for deep learning. arXiv preprint, 2014.