Compact Video Description for Copy Detection with Precise Temporal Alignment
Abstract
This paper introduces a very compact yet discriminative video description that allows example-based search over a large number of frames, corresponding to thousands of hours of video. The description extracts one descriptor per indexed video frame by aggregating a set of local descriptors. These frame descriptors are encoded using a time-aware hierarchical indexing structure. A modified temporal Hough voting scheme ranks the retrieved database videos and estimates the segments in them that match the query. When the videos are described densely in time, matched video segments are localized with excellent precision. Experimental results on the TRECVID 2008 copy detection task and on a set of 38,000 videos from YouTube show that our method offers an excellent trade-off between search accuracy, efficiency, and memory usage.
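To make the voting step concrete, here is a minimal sketch of plain temporal Hough voting. It is not the modified scheme the paper proposes, whose details the abstract does not give. It assumes frame-level matches are already available as tuples of database video, query timestamp, database timestamp, and similarity score. The function name `temporal_hough_vote`, the tuple layout, and the `bin_width` parameter are hypothetical choices made for illustration.

```python
from collections import defaultdict


def temporal_hough_vote(matches, bin_width=1.0):
    """Rank (video, time offset) hypotheses from frame-level matches.

    matches: iterable of (video_id, t_query, t_db, score) tuples, with
    timestamps in seconds. This layout is an assumption for illustration.
    Returns a list of (video_id, offset, total_score, db_start, db_end),
    sorted by decreasing total score.
    """
    # One accumulator cell per (video, quantized offset) pair.
    bins = defaultdict(lambda: {"score": 0.0, "times": []})
    for video_id, t_query, t_db, score in matches:
        # Matches from a true copy share roughly the same offset t_db - t_query,
        # so their votes pile up in the same cell.
        key = (video_id, round((t_db - t_query) / bin_width))
        bins[key]["score"] += score
        bins[key]["times"].append(t_db)

    hypotheses = [
        (video_id, offset_bin * bin_width, cell["score"],
         min(cell["times"]), max(cell["times"]))
        for (video_id, offset_bin), cell in bins.items()
    ]
    # The highest accumulated score comes first.
    hypotheses.sort(key=lambda h: h[2], reverse=True)
    return hypotheses
```

Each returned hypothesis names a database video, the estimated shift between query and database time, and the extent of the database segment covered by the supporting votes. With densely sampled frames, a smaller `bin_width` would give a finer offset estimate, at the cost of spreading votes over more cells.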
Origin : Files produced by the author(s)