
NetVLAD: CNN architecture for weakly supervised place recognition

Relja Arandjelović (1, 2), Petr Gronat (1, 2), Akihiko Torii (3), Tomas Pajdla (4), Josef Sivic (1, 2)
1 WILLOW - Models of visual object recognition and scene understanding
CNRS - Centre National de la Recherche Scientifique : UMR8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
Abstract : We tackle the problem of large-scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph. We present the following three principal contributions. First, we develop a convolutional neural network (CNN) architecture that is trainable in an end-to-end manner directly for the place recognition task. The main component of this architecture, NetVLAD, is a new generalized VLAD layer, inspired by the "Vector of Locally Aggregated Descriptors" image representation commonly used in image retrieval. The layer is readily pluggable into any CNN architecture and amenable to training via backpropagation. Second, we develop a training procedure, based on a new weakly supervised ranking loss, to learn the parameters of the architecture in an end-to-end manner from images depicting the same places over time, downloaded from Google Street View Time Machine. Finally, we show that the proposed architecture obtains a large improvement in performance over non-learnt image representations and significantly outperforms off-the-shelf CNN descriptors on two challenging place recognition benchmarks, and that it outperforms current state-of-the-art compact image representations on standard image retrieval benchmarks.
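
The abstract names a generalized VLAD layer but does not spell out its computation. The following is a minimal NumPy sketch of a forward pass of this kind, not the authors' implementation. It assumes a soft assignment of each local descriptor to K clusters via a softmax over a linear projection, followed by residual aggregation, per-cluster normalization, and a final L2 normalization. The function name `netvlad_forward` and the parameter names `w`, `b`, `c` are illustrative choices, not taken from this record.

```python
import numpy as np


def netvlad_forward(x, w, b, c, eps=1e-12):
    """Aggregate N local descriptors into one VLAD-style vector (sketch).

    x: (N, D) float array of local descriptors (e.g. CNN feature map cells).
    w: (K, D) and b: (K,) soft-assignment parameters (assumed learnable).
    c: (K, D) cluster centres (assumed learnable).
    Returns a (K * D,) vector with unit L2 norm.
    """
    # Soft assignment: softmax over clusters for every descriptor.
    logits = x @ w.T + b                              # (N, K)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    a = np.exp(logits)
    a = a / a.sum(axis=1, keepdims=True)              # (N, K), rows sum to 1

    # Residual of every descriptor to every cluster centre.
    residuals = x[:, None, :] - c[None, :, :]         # (N, K, D)

    # Weighted sum of residuals per cluster.
    v = (a[:, :, None] * residuals).sum(axis=0)       # (K, D)

    # Per-cluster (intra) normalization, then global L2 normalization.
    v = v / (np.linalg.norm(v, axis=1, keepdims=True) + eps)
    v = v.ravel()
    return v / (np.linalg.norm(v) + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, k = 50, 8, 4                                # hypothetical sizes
    x = rng.normal(size=(n, d))
    w = rng.normal(size=(k, d))
    b = np.zeros(k)
    c = rng.normal(size=(k, d))
    print(netvlad_forward(x, w, b, c).shape)
```

Because every step is differentiable in `w`, `b`, and `c`, a layer of this form can be trained via backpropagation, which is the property the abstract emphasizes.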
Document type :
Preprints, Working Papers, ...

https://hal.inria.fr/hal-01242052
Contributor : Relja Arandjelović
Submitted on : Thursday, March 10, 2016 - 6:28:06 PM
Last modification on : Tuesday, January 19, 2021 - 10:16:02 AM

File

cvpr16_place.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01242052, version 2
  • ARXIV : 1511.07247

Citation

Relja Arandjelović, Petr Gronat, Akihiko Torii, Tomas Pajdla, Josef Sivic. NetVLAD: CNN architecture for weakly supervised place recognition. 2015. ⟨hal-01242052v2⟩
