Circulant temporal encoding for video retrieval and temporal alignment

Matthijs Douze (1, 2), Jérôme Revaud (2), Jakob Verbeek (3, 2), Hervé Jégou (4), Cordelia Schmid (2)
2 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
3 Thoth - Learning models from massive data
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann
Abstract: We address the problem of specific video event retrieval. Given a query video of a specific event, e.g., a Madonna concert, the goal is to retrieve other videos of the same event that temporally overlap with the query. Our approach encodes the frame descriptors of a video to jointly represent their appearance and temporal order. It exploits the properties of circulant matrices to efficiently compare the videos in the frequency domain. This offers a significant gain in complexity and accurately localizes the matching parts of videos. The descriptors can be compressed in the frequency domain with a product quantizer adapted to complex numbers. In this case, video retrieval is performed without decompressing the descriptors. We also consider the temporal alignment of a set of videos. We exploit the matching confidence and an estimate of the temporal offset computed for all pairs of videos by our retrieval approach. Our robust algorithm aligns the videos on a global timeline by maximizing the set of temporally consistent matches. The global temporal alignment enables synchronous playback of the videos of a given scene.
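The frequency-domain comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the general idea that the similarity of two frame-descriptor sequences at every temporal shift can be obtained with Fourier transforms instead of an explicit loop over shifts. It leaves out the product-quantizer compression described in the abstract. The function names `temporal_correlation_scores` and `best_offset`, the zero-padding choice, and the random toy descriptors are assumptions made for illustration.

```python
import numpy as np


def temporal_correlation_scores(query, database):
    """Score every temporal shift between two frame-descriptor sequences.

    query:    array of shape (n_q, d), one descriptor per frame
    database: array of shape (n_b, d)

    Returns an array `scores` of length n = n_q + n_b where
    scores[delta] = sum_t <query[t], database[(t + delta) mod n]>,
    with both sequences zero-padded to length n so that the circular
    correlation equals the linear one for all overlapping shifts.
    """
    n_q, d = query.shape
    n_b, d_b = database.shape
    if d != d_b:
        raise ValueError("descriptor dimensions differ")
    n = n_q + n_b  # illustrative padding choice (assumption)

    # Real FFT along the time axis, one transform per descriptor dimension.
    q_hat = np.fft.rfft(query, n=n, axis=0)
    b_hat = np.fft.rfft(database, n=n, axis=0)

    # Cross-correlation theorem: correlation in time is an element-wise
    # product conj(Q) * B in frequency. Summing over descriptor dimensions
    # before the inverse FFT needs only one inverse transform.
    return np.fft.irfft((np.conj(q_hat) * b_hat).sum(axis=1), n=n)


def best_offset(query, database):
    """Return (offset, score) of the best-scoring temporal shift.

    A positive offset means query frame t matches database frame t + offset;
    a negative offset means the query starts before the database video.
    """
    scores = temporal_correlation_scores(query, database)
    n = scores.shape[0]
    n_b = database.shape[0]
    idx = int(np.argmax(scores))
    # Indices at or beyond n_b wrap around to negative shifts.
    offset = idx if idx < n_b else idx - n
    return offset, float(scores[idx])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.standard_normal((50, 16))  # toy descriptors, not real data
    clip = video[20:30] + 0.01 * rng.standard_normal((10, 16))
    print(best_offset(clip, video))
```

With sequences of length n and descriptors of dimension d, this sketch costs on the order of d · n log n operations, against d · n² for a direct evaluation of every shift, which is the kind of complexity gain the abstract refers to.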
Document type: Journal article
International Journal of Computer Vision, Springer Verlag, 2016, 119 (3), pp.291-306. <10.1007/s11263-015-0875-0>

https://hal.inria.fr/hal-01162603
Contributor: Thoth Team
Submitted on: Monday, 30 November 2015 - 18:12:16
Last modified on: Thursday, 27 April 2017 - 14:06:29
Document(s) archived on: Saturday, 29 April 2017 - 03:07:26

File

paper.pdf
Files produced by the author(s)

Citation

Matthijs Douze, Jérôme Revaud, Jakob Verbeek, Hervé Jégou, Cordelia Schmid. Circulant temporal encoding for video retrieval and temporal alignment. International Journal of Computer Vision, Springer Verlag, 2016, 119 (3), pp.291-306. <10.1007/s11263-015-0875-0>. <hal-01162603v2>
