J. Almazan, B. Gajic, N. Murray, and D. Larlus, Re-id done right: towards good practices for person reidentification, 2018.

A. Babenko and V. Lempitsky, Aggregating deep convolutional features for image retrieval, ICCV, 2015.

A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky, Neural codes for image retrieval, ECCV, 2014.

B. Christopher, J. Choy, S. Gwak, M. Savarese, and . Chandraker, Universal correspondence network, NIPS, vol.1, p.3, 2016.

O. Chum, J. Matas, and J. Kittler, Locally optimized ransac, DAGM Symposium on Pattern Recognition, p.236, 2003.

O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, Total recall: Automatic query expansion with a generative feature model for object retrieval, ICCV, 2002.

W. Dong, R. Socher, L. Li-jia, K. Li, and L. Fei-fei, Imagenet: A large-scale hierarchical image database, CVPR, vol.6, p.7, 2009.

A. Dosovitskiy, P. Fischer, E. Ilg, P. Hausser, C. Hazirbas et al., Flownet: Learning optical flow with convolutional networks, CVPR, pp.2758-2766, 2015.

A. Martin, R. C. Fischler, and . Bolles, Random sample consensus, Communications of ACM, vol.6, issue.24, pp.381-395, 1981.

A. Gordo, J. Almazan, J. Revaud, and D. Larlus, End-to-end learning of deep visual representations for image retrieval, IJCV, vol.124, issue.2, pp.237-254, 2008.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, CVPR, 2016.

A. Iscen and G. Tolias, Efficient diffusion on region manifolds: Recovering small objects with compact cnn representations, CVPR, vol.7, 2006.
URL : https://hal.archives-ouvertes.fr/hal-01505470

H. Jégou and O. Chum, Negative evidences and co-occurrences in image retrieval: The benefit of PCA and whitening, ECCV, 2012.

H. Jégou, M. Douze, C. Schmid, and P. Pérez, Aggregating local descriptors into a compact image representation, CVPR, 2001.

Y. Kalantidis, C. Mellina, and S. Osindero, Cross-dimensional weighting for aggregated deep convolutional features, Computer Vision -ECCV 2016 Workshops, vol.1, pp.685-701, 2016.

S. Kim, D. Min, B. Ham, S. Jeon, S. Lin et al., Fcss: Fully convolutional self-similarity for dense semantic correspondence, CVPR, 2017.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, NIPS, 2012.

L. Jonathan, N. Long, T. Zhang, and . Darrell, Do convnets learn correspondence, NIPS. 2014. 1, vol.2, p.3

G. David and . Lowe, Object recognition from local scale-invariant features, ICCV, vol.1, pp.1150-1157, 1999.

J. Matas, O. Chum, U. Martin, and T. Pajdla, Robust wide baseline stereo from maximally stable extremal regions, BMVC, pp.384-393, 2002.

K. Mikolajczyk and J. Matas, Improving descriptors for fast tree matching by optimal linear projection, CVPR, vol.6, p.7, 2007.

K. Mikolajczyk and C. Schmid, Scale and affine invariant interest point detectors, IJCV, vol.60, issue.1, pp.63-86, 2004.

D. Nistér and H. Stewénius, Scalable recognition with a vocabulary tree, CVPR, pp.2161-2168, 2001.

H. Noh, A. Araujo, J. Sim, T. Weyand, and B. Han, Largescale image retrieval with attentive deep local features, ICCV, 2008.

M. Perdoch, O. Chum, and J. Matas, Efficient representation of local geometry for large scale object retrieval, CVPR, 2001.

J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, CVPR, 2007.

F. Radenovi? and A. Iscen, Revisiting oxford and paris: Large-scale image retrieval benchmarking, Giorgos Tolias, Yannis Avrithis, and Ond?ej Chum, vol.7, 2006.

F. Radenovi?, G. Tolias, and O. Chum, CNN image retrieval learns from bow: Unsupervised fine-tuning with hard examples, ECCV, vol.7, 2016.

F. Radenovi?, G. Tolias, and O. Chum, Finetuning cnn image retrieval with no human annotation, IEEE Trans. PAMI, 2008.

A. Sharif-razavian, J. Sullivan, A. Maki, and S. Carlsson, Visual instance retrieval with deep convolutional networks, 2014.

I. Rocco, R. Arandjelovic, and J. Sivic, End-toend weakly-supervised semantic alignment, CVPR, vol.1, p.3, 2018.

I. Rocco, M. Cimpoi, R. Arandjelovi?, A. Torii, T. Pajdla et al., Neighbourhood Consensus Networks, NIPS, 2002.
URL : https://hal.archives-ouvertes.fr/hal-01905474

R. Ramprasaath, M. Selvaraju, A. Cogswell, R. Das, D. Vedantam et al., Grad-cam: Visual explanations from deep networks via gradient-based localization, 2017 IEEE International Conference on Computer Vision (ICCV), pp.618-626, 2002.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition. ICLR, vol.3, p.6, 2014.

J. Sivic and A. Zisserman, Video Google: A text retrieval approach to object matching in videos, ICCV, 2003.

G. Tolias and Y. Avrithis, Speeded-up, relaxed spatial matching, ICCV, 2011.

G. Tolias, Y. Avrithis, and H. Jégou, To aggregate or not to aggregate: Selective match kernels for image search, ICCV, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00864684

G. Tolias, Y. Avrithis, and H. Jégou, Image search with selective match kernels: aggregation across single and multiple images. IJCV, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01131898

G. Tolias and H. Jegou, Visual query expansion with or without geometry: Refining local descriptors by feature aggregation, Pattern Recognition, vol.2, issue.8, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00971267

G. Tolias, R. Sicre, and H. Jégou, Particular object retrieval with integral max-pooling of cnn activations. ICLR, p.6, 2003.
URL : https://hal.archives-ouvertes.fr/hal-01842218

A. Vedaldi and B. Fulkerson, VLFeat: An open and portable library of computer vision algorithms, 2008.

E. Kwang-moo-yi, V. Trulls, P. Lepetit, and . Fua, Lift: Learned invariant feature transform, ECCV, vol.1, 2016.

B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, Learning deep features for discriminative localization, CVPR, 2002.