Higher-order Occurrence Pooling on Mid- and Low-level Features: Visual Concept Detection

Piotr Koniusz 1, 2, * Fei Yan 1 Philippe-Henri Gosselin 3, 4 Krystian Mikolajczyk 1
* Corresponding author
2 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
3 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: In object recognition, the Bag-of-Words model assumes: i) extraction of local descriptors from images, ii) embedding of these descriptors by a coder into a given visual vocabulary space, which results in so-called mid-level features, iii) extraction of statistics from the mid-level features with a pooling operator that aggregates occurrences of visual words in images into so-called signatures. As the last step aggregates only occurrences of visual words, it is called First-order Occurrence Pooling. This paper investigates higher-order approaches. We propose to aggregate over co-occurrences of visual words, derive Bag-of-Words with Second- and Higher-order Occurrence Pooling based on linearisation of the so-called Minor Polynomial Kernel, and extend this model to work with adequate pooling operators. For bi- and multi-modal coding, a novel higher-order fusion is derived. We show that the well-known Spatial Pyramid Matching and related methods constitute special cases of this fusion. Moreover, we propose Third-order Occurrence Pooling directly on local image descriptors and a novel pooling operator that removes undesired correlation from the image signatures. Finally, Uni- and Bi-modal First-, Second-, and Third-order Occurrence Pooling are evaluated given various coders and pooling operators. The proposed methods are compared to other approaches (e.g. Fisher Vector Encoding) in the same testbed and attain state-of-the-art results.
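The difference between aggregating occurrences and aggregating co-occurrences of visual words can be illustrated with a minimal sketch. It is not the authors' implementation: the soft-assignment coder, the use of average pooling, the toy data, and all function names are assumptions chosen for illustration, and the linearisation of the Minor Polynomial Kernel described in the report is not reproduced here.

```python
import math
from itertools import combinations_with_replacement


def soft_assign(descriptor, vocabulary, sigma=1.0):
    """Hypothetical coder: soft-assign one local descriptor to K visual words."""
    weights = [
        math.exp(-sum((d - c) ** 2 for d, c in zip(descriptor, word)) / (2 * sigma ** 2))
        for word in vocabulary
    ]
    total = sum(weights)
    return [w / total for w in weights]


def first_order_pooling(codes):
    """Average the occurrences of each visual word over all descriptors."""
    n, k = len(codes), len(codes[0])
    return [sum(code[i] for code in codes) / n for i in range(k)]


def second_order_pooling(codes):
    """Average the co-occurrences (pairwise products) of visual words.

    Only pairs with i <= j are kept, because the co-occurrence matrix is symmetric.
    """
    n, k = len(codes), len(codes[0])
    pairs = list(combinations_with_replacement(range(k), 2))
    return [sum(code[i] * code[j] for code in codes) / n for i, j in pairs]


# Toy data, made up for illustration: 3 two-dimensional descriptors, 2 visual words.
vocabulary = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [0.5, 0.5]]

# Step ii) of the pipeline: mid-level features, one K-dimensional code per descriptor.
codes = [soft_assign(d, vocabulary) for d in descriptors]

# Step iii): image signatures of length K and K * (K + 1) / 2 respectively.
signature_1 = first_order_pooling(codes)
signature_2 = second_order_pooling(codes)
```

With K visual words, the first-order signature has K entries while the second-order signature has K(K+1)/2, which is why higher-order signatures grow quickly with the vocabulary size in this simplified setting.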

Cited literature: 58 references

https://hal.inria.fr/hal-00922524
Contributor: Piotr Koniusz
Submitted on : Friday, December 27, 2013 - 2:17:19 PM
Last modification on : Monday, December 17, 2018 - 11:22:02 AM
Document(s) archived on: Thursday, March 27, 2014 - 11:50:18 PM

File

pkpami2c.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00922524, version 1

Citation

Piotr Koniusz, Fei Yan, Philippe-Henri Gosselin, Krystian Mikolajczyk. Higher-order Occurrence Pooling on Mid- and Low-level Features: Visual Concept Detection. [Technical Report] 2013, pp.20. ⟨hal-00922524⟩
