Track to the Future: Spatio-temporal Video Segmentation with Long-range Motion Cues

Jose Lezama 1 Karteek Alahari 2, 3 Josef Sivic 2, 3 Ivan Laptev 2, 3
2 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract : Video provides not only rich visual cues such as motion and appearance, but also much less explored long-range temporal interactions among objects. We aim to capture such interactions and to construct a powerful intermediate-level video representation for subsequent recognition. Motivated by this goal, we seek a spatio-temporal oversegmentation of a video into regions that respect object boundaries and, at the same time, associate object pixels over many video frames. The contributions of this paper are twofold. First, we develop an efficient spatio-temporal video segmentation algorithm, which naturally incorporates long-range motion cues from past and future frames in the form of clusters of point tracks with coherent motion. Second, we devise a new track clustering cost function that includes occlusion reasoning, in the form of depth ordering constraints, as well as motion similarity along the tracks. We evaluate the proposed approach on a challenging set of video sequences of office scenes from feature-length movies.
Document type : Conference papers

https://hal.inria.fr/hal-00817961
Contributor : Karteek Alahari
Submitted on : Thursday, October 17, 2013 - 7:02:25 PM
Last modification on : Tuesday, September 22, 2020 - 3:50:14 AM
Long-term archiving on : Saturday, January 18, 2014 - 2:40:18 AM

File

lezama11.pdf
Files produced by the author(s)

Citation

Jose Lezama, Karteek Alahari, Josef Sivic, Ivan Laptev. Track to the Future: Spatio-temporal Video Segmentation with Long-range Motion Cues. CVPR - IEEE Conference on Computer Vision and Pattern Recognition, Jun 2011, Colorado Springs, United States. pp.3369 - 3376, ⟨10.1109/CVPR.2011.6044588⟩. ⟨hal-00817961⟩
