Higher-order Occurrence Pooling on Mid- and Low-level Features: Visual Concept Detection

Piotr Koniusz 1, 2, * Fei Yan 1 Philippe-Henri Gosselin 3, 4 Krystian Mikolajczyk 1
* Corresponding author
2 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
3 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: In object recognition, the Bag-of-Words model involves: i) extracting local descriptors from images, ii) embedding these descriptors by a coder into a given visual vocabulary space, which results in so-called mid-level features, and iii) extracting statistics from the mid-level features with a pooling operator that aggregates occurrences of visual words in images into so-called signatures. As the last step aggregates only occurrences of visual words, it is called First-order Occurrence Pooling. This paper investigates higher-order approaches. We propose to aggregate over co-occurrences of visual words, derive Bag-of-Words with Second- and Higher-order Occurrence Pooling based on linearisation of the so-called Minor Polynomial Kernel, and extend this model to work with adequate pooling operators. For bi- and multi-modal coding, a novel higher-order fusion is derived. We show that the well-known Spatial Pyramid Matching and related methods constitute its special cases. Moreover, we propose Third-order Occurrence Pooling directly on local image descriptors and a novel pooling operator that removes undesired correlation from the image signatures. Finally, Uni- and Bi-modal First-, Second-, and Third-order Occurrence Pooling are evaluated with various coders and pooling operators. The proposed methods are compared to other approaches (e.g. Fisher Vector Encoding) in the same testbed and attain state-of-the-art results.
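The difference between first- and second-order pooling described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the soft-assignment coder, the use of plain average pooling, and all function names (`soft_assign`, `first_order_pooling`, `second_order_pooling`, `frobenius_dot`) are assumptions made for illustration, and a generic degree-2 polynomial kernel stands in for the Minor Polynomial Kernel, whose exact form is not given here.

```python
import math


def soft_assign(descriptor, vocabulary, sigma=1.0):
    """Embed one local descriptor as a mid-level feature.

    Assumption: a Gaussian soft-assignment coder; the report evaluates
    several coders and this is only one plausible choice.
    """
    # Squared Euclidean distance from the descriptor to every visual word.
    d2 = [sum((x - c) ** 2 for x, c in zip(descriptor, word))
          for word in vocabulary]
    weights = [math.exp(-d / (2.0 * sigma ** 2)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]  # entries sum to 1


def first_order_pooling(features):
    """Average the occurrences of each visual word over the image."""
    n, k = len(features), len(features[0])
    return [sum(f[i] for f in features) / n for i in range(k)]


def second_order_pooling(features):
    """Average the co-occurrences (outer products) of visual words."""
    n, k = len(features), len(features[0])
    return [[sum(f[i] * f[j] for f in features) / n for j in range(k)]
            for i in range(k)]


def frobenius_dot(a, b):
    """Inner product between two K x K signatures."""
    return sum(x * y for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))
```

Under these assumptions, the inner product of two second-order signatures equals the average of the squared dot products between all pairs of mid-level features from the two images, which is the sense in which a degree-2 polynomial kernel is linearised. The first-order signature has K entries for a vocabulary of K visual words, while the second-order signature is a symmetric K x K matrix, so only its upper triangle needs to be stored.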
Document type:
[Technical Report] 2013, pp.20

Contributor: Piotr Koniusz
Submitted on: Friday, 27 December 2013 - 14:17:19
Last modified on: Friday, 13 January 2017 - 14:20:15
Document(s) archived on: Thursday, 27 March 2014 - 23:50:18


Files produced by the author(s)


  • HAL Id: hal-00922524, version 1


Piotr Koniusz, Fei Yan, Philippe-Henri Gosselin, Krystian Mikolajczyk. Higher-order Occurrence Pooling on Mid- and Low-level Features: Visual Concept Detection. [Technical Report] 2013, pp.20. <hal-00922524>


