R. Arandjelović and A. Zisserman, Three things everyone should know to improve object retrieval, Proc. CVPR, p.7, 2012.

R. Arandjelović and A. Zisserman, All about VLAD, Proc. CVPR, 2013.

R. Arandjelović and A. Zisserman, DisLocation: Scalable descriptor distinctiveness for location recognition, Proc. ACCV, 2014.

M. Aubry, B. C. Russell, and J. Sivic, Painting-to-3D model alignment via discriminative visual elements, ACM Transactions on Graphics, vol.33, issue.2, p.14, 2014.
DOI : 10.1145/2591009
URL : https://hal.archives-ouvertes.fr/hal-00863615

H. Azizpour, A. Razavian, J. Sullivan, A. Maki, and S. Carlsson, Factors of transferability from a generic ConvNet representation, CoRR, abs/1406, p.13, 2014.

A. Babenko and V. Lempitsky, Aggregating local deep features for image retrieval, Proc. ICCV, p.14, 2015.

A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky, Neural Codes for Image Retrieval, Proc. ECCV, p.14, 2014.
DOI : 10.1007/978-3-319-10590-1_38

S. Cao and N. Snavely, Graph-based discriminative learning for location recognition, Proc. CVPR, 2013.

D. M. Chen, G. Baatz, K. Koeser, S. S. Tsai, R. Vedantham et al., City-scale landmark identification on mobile devices, CVPR 2011, p.11, 2011.
DOI : 10.1109/CVPR.2011.5995610

O. Chum, A. Mikulik, M. Perďoch, and J. Matas, Total recall II: Query expansion revisited, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995601

O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval, 2007 IEEE 11th International Conference on Computer Vision, 2007.
DOI : 10.1109/ICCV.2007.4408891

M. Cimpoi, S. Maji, and A. Vedaldi, Deep filter banks for texture recognition and segmentation, Proc. CVPR, p.4, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01263622

G. Csurka, C. Bray, C. Dance, and L. Fan, Visual categorization with bags of keypoints, Workshop on Statistical Learning in Computer Vision, ECCV, pp.1-22, 2004.

M. Cummins and P. Newman, FAB-MAP: Probabilistic localization and mapping in the space of appearance, The International Journal of Robotics Research, vol.27, issue.6, 2008.

M. Cummins and P. Newman, Highly scalable appearance-only SLAM - FAB-MAP 2.0, RSS, 2009.

J. Delhumeau, P. Gosselin, H. Jégou, and P. Pérez, Revisiting the VLAD image representation, Proceedings of the 21st ACM international conference on Multimedia, MM '13, 2013.
DOI : 10.1145/2502081.2502171
URL : https://hal.archives-ouvertes.fr/hal-00840653

J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang et al., DeCAF: A deep convolutional activation feature for generic visual recognition, CoRR, abs/1310, 2013.

J. Foulds and E. Frank, A review of multi-instance learning assumptions, The Knowledge Engineering Review, vol.25, issue.01, pp.1-25, 2010.
DOI : 10.1016/S0004-3702(96)00034-3

R. B. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.13, 2014.
DOI : 10.1109/CVPR.2014.81

Y. Gong, L. Wang, R. Guo, and S. Lazebnik, Multi-scale Orderless Pooling of Deep Convolutional Activation Features, Proc. ECCV, 2014.
DOI : 10.1007/978-3-319-10584-0_26

A. Gordo, J. A. Rodríguez-Serrano, F. Perronnin, and E. Valveny, Leveraging category-level labels for instance-level image retrieval, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.3045-3052, 2012.
DOI : 10.1109/CVPR.2012.6248035

P. Gronat, G. Obozinski, J. Sivic, and T. Pajdla, Learning and calibrating per-location classifiers for visual place recognition, Proc. CVPR, p.11, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00934332

H. Jégou and O. Chum, Negative evidences and cooccurrences in image retrieval: the benefit of PCA and whitening, Proc. ECCV, p.7, 2012.

H. Jégou, M. Douze, and C. Schmid, Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search, Proc. ECCV, 2008.
DOI : 10.1007/978-3-540-88682-2_24

H. Jégou, M. Douze, and C. Schmid, On the burstiness of visual elements, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009.
DOI : 10.1109/CVPR.2009.5206609

H. Jégou, M. Douze, and C. Schmid, Product Quantization for Nearest Neighbor Search, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.1, 2011.
DOI : 10.1109/TPAMI.2010.57

H. Jégou, M. Douze, C. Schmid, and P. Pérez, Aggregating local descriptors into a compact image representation, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540039

H. Jégou, H. Harzallah, and C. Schmid, A contextual dissimilarity measure for accurate and efficient image search, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.382970

H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez et al., Aggregating Local Image Descriptors into Compact Codes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.9, 2012.
DOI : 10.1109/TPAMI.2011.235

H. Jégou and A. Zisserman, Triangulation Embedding and Democratic Aggregation for Image Search, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.14, 2014.
DOI : 10.1109/CVPR.2014.417

A. Karpathy and L. Fei-Fei, Deep visual-semantic alignments for generating image descriptions, Proc. CVPR, 2015.

A. Kendall, M. Grimes, and R. Cipolla, PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
DOI : 10.1109/ICCV.2015.336

J. Knopp, J. Sivic, and T. Pajdla, Avoiding Confusing Features in Place Recognition, Proc. ECCV, 2010.
DOI : 10.1007/978-3-642-15549-9_54

D. Kotzias, M. Denil, P. Blunsom, and N. de Freitas, Deep multi-instance transfer learning, CoRR, abs/1411.3128, 2014.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, NIPS, pp.1106-1114, 2012.

Y. Lecun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, pp.541-551, 1989.
DOI : 10.1162/neco.1989.1.4.541

Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998.

Y. Li, N. Snavely, D. Huttenlocher, and P. Fua, Worldwide pose estimation using 3D point clouds, Proc. ECCV, 2012.

T. Lin, Y. Cui, S. Belongie, and J. Hays, Learning deep representations for ground-to-aerial geolocalization, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7299135

T. Lin, A. Roychowdhury, and S. Maji, Bilinear CNN Models for Fine-Grained Visual Recognition, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
DOI : 10.1109/ICCV.2015.170

D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94

W. Maddern and S. Vidas, Towards robust night and day place recognition using visible and thermal imaging, Proc

A. Makadia, Feature Tracking for Wide-Baseline Image Retrieval, Proc. ECCV, 2010.
DOI : 10.1007/978-3-642-15555-0_23

C. McManus, W. Churchill, W. Maddern, A. Stewart, and P. Newman, Shady dealings: Robust, long-term visual localisation using illumination invariance, 2014 IEEE International Conference on Robotics and Automation (ICRA), 2014.
DOI : 10.1109/ICRA.2014.6906961

S. Middelberg, T. Sattler, O. Untzelmann, and L. Kobbelt, Scalable 6-DOF Localization on Mobile Devices, Proc. ECCV, 2014.
DOI : 10.1007/978-3-319-10605-2_18

A. Mikulik, M. Perďoch, O. Chum, and J. Matas, Learning a Fine Vocabulary, Proc. ECCV, 2010.
DOI : 10.1007/978-3-642-15558-1_1

M. Oquab, L. Bottou, I. Laptev, and J. Sivic, Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.13, 2014.
DOI : 10.1109/CVPR.2014.222
URL : https://hal.archives-ouvertes.fr/hal-00911179

M. Paulin, M. Douze, Z. Harchaoui, J. Mairal, F. Perronnin et al., Local Convolutional Features with Unsupervised Training for Image Retrieval, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
DOI : 10.1109/ICCV.2015.19
URL : https://hal.archives-ouvertes.fr/hal-01207966

F. Perronnin and C. Dance, Fisher Kernels on Visual Vocabularies for Image Categorization, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.383266

F. Perronnin, Y. Liu, J. Sánchez, and H. Poirier, Large-scale image retrieval with compressed Fisher vectors, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540009

J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, 2007 IEEE Conference on Computer Vision and Pattern Recognition, p.14, 2007.
DOI : 10.1109/CVPR.2007.383172

J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, Lost in quantization: Improving particular object retrieval in large scale image databases, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.14, 2008.
DOI : 10.1109/CVPR.2008.4587635

J. Philbin, M. Isard, J. Sivic, and A. Zisserman, Descriptor Learning for Efficient Retrieval, Proc. ECCV, 2010.
DOI : 10.1007/978-3-642-15558-1_49

D. Qin, X. Chen, M. Guillaumin, and L. Van Gool, Quantized kernel learning for feature matching, NIPS, 2014.

D. Qin, Y. Chen, M. Guillaumin, and L. Van Gool, Learning to rank bag-of-word histograms for large-scale object retrieval

D. Qin, S. Gammeter, L. Bossard, T. Quack, and L. Van Gool, Hello neighbor: Accurate object retrieval with k-reciprocal nearest neighbors, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995373

D. Qin, C. Wengert, and L. Van Gool, Query Adaptive Similarity for Large Scale Object Retrieval, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.211

A. S. Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, CNN features off-the-shelf: An astounding baseline for recognition, CoRR, abs/1403, p.13, 2014.

A. S. Razavian, J. Sullivan, A. Maki, and S. Carlsson, A baseline for visual instance retrieval with deep convolutional networks, Proc. ICLR, p.14, 2015.

T. Sattler, M. Havlena, F. Radenović et al., Hyperpoints and fine vocabularies for large-scale location recognition, Proc. ICCV, 2015.

T. Sattler, B. Leibe, and L. Kobbelt, Fast image-based localization using direct 2D-to-3D matching, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126302

T. Sattler, T. Weyand, B. Leibe, and L. Kobbelt, Image Retrieval for Image-Based Localization Revisited, Proceedings of the British Machine Vision Conference 2012, 2012.
DOI : 10.5244/C.26.76

G. Schindler, M. Brown, and R. Szeliski, City-Scale Location Recognition, 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007.
DOI : 10.1109/CVPR.2007.383150

F. Schroff, D. Kalenichenko, and J. Philbin, FaceNet: A unified embedding for face recognition and clustering, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298682

M. Schultz and T. Joachims, Learning a distance metric from relative comparisons, NIPS, 2004.

P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus et al., OverFeat: Integrated recognition, localization and detection using convolutional networks, CoRR, abs/1312, 2013.

E. Simo-Serra, E. Trulls, L. Ferraz, I. Kokkinos, and F. Moreno-Noguer, Fracking deep convolutional image descriptors, CoRR, abs/1412, 2014.

K. Simonyan, A. Vedaldi, and A. Zisserman, Descriptor Learning Using Convex Optimisation, Proc. ECCV, 2012.
DOI : 10.1007/978-3-642-33718-5_18

K. Simonyan, A. Vedaldi, and A. Zisserman, Deep Fisher networks for large-scale image classification, NIPS, 2013.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, Proc. ICLR, p.11, 2015.

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, pp.1470-1477, 2003.
DOI : 10.1109/ICCV.2003.1238663

N. Sünderhauf, S. Shirazi, A. Jacobson, E. Pepperell, F. Dayoub et al., Place Recognition with ConvNet Landmarks: Viewpoint-Robust, Condition-Robust, Training-Free, Robotics: Science and Systems XI, 2015.
DOI : 10.15607/RSS.2015.XI.022

V. Sydorov, M. Sakurada, and C. Lampert, Deep Fisher Kernels -- End to End Learning of the Fisher Kernel GMM Parameters, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.182

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed et al., Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298594

G. Tolias, Y. Avrithis, and H. Jégou, To Aggregate or Not to Aggregate: Selective Match Kernels for Image Search, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.177
URL : https://hal.archives-ouvertes.fr/hal-00864684

G. Tolias and H. Jégou, Visual query expansion with or without geometry: Refining local descriptors by feature aggregation, Pattern Recognition, vol.47, issue.10, 2014.
DOI : 10.1016/j.patcog.2014.04.007
URL : https://hal.archives-ouvertes.fr/hal-00971267

A. Torii, R. Arandjelović, J. Sivic, M. Okutomi, and T. Pajdla, 24/7 place recognition by view synthesis, Proc. CVPR, p.15, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01147212

A. Torii, J. Sivic, T. Pajdla, and M. Okutomi, Visual place recognition with repetitive structures, Proc. CVPR, p.12, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00934288

T. Turcot and D. G. Lowe, Better matching with fewer features: The selection of useful features in large database recognition problems, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, 2009.
DOI : 10.1109/ICCVW.2009.5457541

T. Tuytelaars and K. Mikolajczyk, Local invariant feature detectors: A survey, Foundations and Trends in Computer Graphics and Vision, pp.177-280, 2008.

A. Vedaldi and K. Lenc, MatConvNet - Convolutional neural networks for MATLAB, Proc. ACMM, 2015.

P. Viola, J. C. Platt, and C. Zhang, Multiple instance boosting for object detection, NIPS, 2005.

J. Wang, Y. Song, T. Leung, C. Rosenberg, J. Wang et al., Learning Fine-Grained Image Similarity with Deep Ranking, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.180

K. Q. Weinberger, J. Blitzer, and L. Saul, Distance metric learning for large margin nearest neighbor classification, NIPS, 2006.

S. Winder, G. Hua, and M. Brown, Picking the best DAISY, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.178-185, 2009.
DOI : 10.1109/CVPR.2009.5206839

M. D. Zeiler and R. Fergus, Visualizing and Understanding Convolutional Networks, Proc. ECCV, p.13, 2014.
DOI : 10.1007/978-3-319-10590-1_53

J. Zepeda and P. Pérez, Exemplar SVMs as visual feature encoders, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298924

B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, Learning deep features for scene recognition using places database, NIPS, p.12, 2014.