inria-00548641, version 3
Compact Video Description for Copy Detection with Precise Temporal Alignment
Matthijs Douze (a, 1), Hervé Jégou (a, 2), Cordelia Schmid (a, 1), Patrick Pérez (a, 3)
European Conference on Computer Vision (ECCV '10) 6311 (2010) 522-535
Abstract: This paper introduces a very compact yet discriminative video description, which allows example-based search in a large number of frames corresponding to thousands of hours of video. Our description extracts one descriptor per indexed video frame by aggregating a set of local descriptors. These frame descriptors are encoded using a time-aware hierarchical indexing structure. A modified temporal Hough voting scheme is used to rank the retrieved database videos and estimate segments in them that match the query. If we use a dense temporal description of the videos, matched video segments are localized with excellent precision. Experimental results on the TRECVID 2008 copy detection task and a set of 38,000 videos from YouTube show that our method offers an excellent trade-off between search accuracy, efficiency and memory usage.
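The temporal voting step mentioned in the abstract can be illustrated with a minimal sketch. It shows plain temporal Hough voting, not the modified scheme of the paper, whose details are not given in this record. The function name `temporal_hough_vote`, the match tuple layout and the `bin_width` parameter are assumptions made for illustration only.

```python
from collections import defaultdict


def temporal_hough_vote(matches, bin_width=1.0):
    """Rank (video, time shift) hypotheses from frame-level matches.

    Hypothetical input layout (not taken from the paper): each match is
    (query_time, video_id, db_time, score), with times in seconds.
    """
    votes = defaultdict(float)
    for query_time, video_id, db_time, score in matches:
        # Frames of a true copy share the same shift db_time - query_time,
        # so their scores accumulate in a single histogram bin.
        shift_bin = round((db_time - query_time) / bin_width)
        votes[(video_id, shift_bin)] += score
    # Strongest hypothesis first.
    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    return [(vid, b * bin_width, total) for (vid, b), total in ranked]


example_matches = [
    (0.0, "A", 10.0, 1.0),
    (1.0, "A", 11.0, 1.0),
    (2.0, "A", 12.0, 0.5),
    (0.0, "B", 5.0, 0.8),
    (1.0, "A", 3.0, 0.2),  # spurious match with an inconsistent shift
]
ranking = temporal_hough_vote(example_matches)
```

In this toy input, the three matches for video "A" agree on a 10-second shift and therefore outvote the isolated matches. The winning shift also indicates where in the database video the matching segment begins.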
- a – INRIA
- 1: LEAR (INRIA Grenoble Rhône-Alpes / LJK Laboratoire Jean Kuntzmann)
- CNRS : FR71 – CNRS : UMR5527 – INRIA – Laboratoire Jean Kuntzmann – Université Joseph Fourier - Grenoble I – Institut National Polytechnique de Grenoble (INPG)
- 2: TEXMEX (INRIA - IRISA)
- CNRS : UMR6074 – INRIA – INSA Rennes – Université de Rennes 1
- 3: Technicolor R & I
- Technicolor
- Domain : Computer Science/Computer Vision and Pattern Recognition
- Available versions : v1 (2010-12-20) v2 (2011-03-22) v3 (2011-03-23)
- http://hal.inria.fr/inria-00548641
- oai:hal.inria.fr:inria-00548641
- From: Hervé Jégou
- Submitted on: Tuesday, 22 March 2011 21:22:46
- Updated on: Thursday, 24 March 2011 10:07:41






