Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

Abstract : Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when using high-dimensional representations, such as Fisher vectors and convolutional neural network features. We also propose a window refinement method, which improves the localization accuracy by incorporating an objectness prior. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset, which verifies the effectiveness of our approach.
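The abstract names the multi-fold procedure without spelling out its steps. The following is a minimal sketch of one reading of it, assuming a cross-validation-style split: the positive images are divided into folds, a detector is trained on the windows currently selected in the other folds, and the windows of the held-out fold are then re-selected with that detector. Everything concrete here is an assumption made for illustration: the names `multifold_mil` and `train_linear`, the ridge-regression scorer standing in for the actual detector, the initialization that picks the first window of each image, and the default numbers of folds and iterations.

```python
import numpy as np


def train_linear(X, y, l2=1.0):
    """Ridge-regression scorer, a hypothetical stand-in for the real detector."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ y)


def multifold_mil(pos_bags, neg_feats, n_folds=3, n_iters=5, seed=0):
    """Sketch of multi-fold MIL (requires n_folds >= 2).

    pos_bags  : list of (n_windows, d) arrays, one per positive image
    neg_feats : (n_neg, d) array of window features from negative images
    Returns the final weight vector and the selected window index per image.
    """
    rng = np.random.default_rng(seed)
    n = len(pos_bags)
    selected = [0] * n  # assumed initialization: first window of each image
    folds = np.array_split(rng.permutation(n), n_folds)

    def fit(indices, current):
        # Train on the currently selected positive windows plus all negatives.
        X_pos = np.stack([pos_bags[i][current[i]] for i in indices])
        X = np.vstack([X_pos, neg_feats])
        y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(neg_feats))])
        return train_linear(X, y)

    for _ in range(n_iters):
        new_selected = list(selected)
        for held_out in folds:
            held = set(held_out.tolist())
            train_idx = [i for i in range(n) if i not in held]
            w = fit(train_idx, selected)
            # Re-localize only in images the detector was NOT trained on,
            # so it cannot simply re-confirm its own previous choices.
            for i in held_out:
                new_selected[i] = int(np.argmax(pos_bags[i] @ w))
        if new_selected == selected:
            break
        selected = new_selected

    # Final detector trained on the selected windows of all positive images.
    return fit(range(n), selected), selected
```

In standard multiple-instance learning the detector re-scores the same windows it was just trained on, which, per the abstract, can lock onto erroneous locations early, especially with high-dimensional features. Holding out a fold during re-localization is the part of this sketch meant to illustrate how that is avoided.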
Document type : Journal articles

Contributor : Thoth Team
Submitted on : Monday, February 22, 2016 - 4:32:28 PM
Last modification on : Wednesday, April 11, 2018 - 1:57:52 AM
Document(s) archived on : Monday, May 23, 2016 - 2:33:41 PM


Files produced by the author(s)




Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid. Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2017, 39 (1), pp.189-203. ⟨10.1109/TPAMI.2016.2535231⟩. ⟨hal-01123482v3⟩


