One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network

Abstract: There is an inherent need for machines to have a notion of how entities within their environment behave and to anticipate changes in the near future. In this work, we focus on anticipating future appearance given the current frame of a video. Typical methods either predict the next frame of a video or predict future optical flow or motion trajectories from a single video frame. This work presents an experiment on stretching the ability of CNNs to anticipate appearance at an arbitrarily chosen near-future time, by conditioning the predicted video frames on a continuous time variable. We show that CNNs can learn an intrinsic representation of typical appearance changes over time and successfully generate realistic predictions in one step, at a deliberate time difference in the near future. The method is evaluated on the KTH human actions dataset and compared to a baseline consisting of an analogous CNN architecture that is not time-aware.
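To make the conditioning idea concrete, below is a minimal, hypothetical PyTorch sketch of a convolutional encoder-decoder whose prediction is conditioned on a continuous time variable. This is not the authors' released code: the 64x64 grayscale input, the layer sizes, and the scheme of broadcasting the time offset dt and concatenating it as an extra bottleneck channel are all illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class TimeConditionedEncoderDecoder(nn.Module):
    """Sketch: predict the frame dt seconds ahead in a single forward pass."""

    def __init__(self):
        super().__init__()
        # Encoder: compress the current grayscale frame into a feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=4, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(),
        )
        # Decoder: reconstruct the anticipated future frame. It receives one
        # extra channel holding the broadcast time offset (an assumed scheme).
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64 + 1, 32, kernel_size=4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),       # 32 -> 64
            nn.Sigmoid(),
        )

    def forward(self, frame, dt):
        # frame: (B, 1, 64, 64) current frames; dt: (B,) continuous time offsets.
        z = self.encoder(frame)
        # Tile the time variable over the bottleneck's spatial grid and
        # concatenate it as an additional feature channel.
        t = dt.view(-1, 1, 1, 1).expand(-1, 1, z.shape[2], z.shape[3])
        return self.decoder(torch.cat([z, t], dim=1))

# Usage: anticipate appearance 0.25 s and 0.5 s ahead, each in one step.
model = TimeConditionedEncoderDecoder()
frame = torch.rand(2, 1, 64, 64)
dt = torch.tensor([0.25, 0.5])
future = model(frame, dt)  # (2, 1, 64, 64)

A non-time-aware baseline, as used for comparison in the paper, would correspond to the same encoder-decoder without the dt channel, fixed to a single prediction horizon.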

https://hal.inria.fr/hal-01467064
Contributor: Vedran Vukotić
Submitted on: Tuesday, February 14, 2017 - 9:33:14 AM
Last modification on: Thursday, February 7, 2019 - 4:33:25 PM
Long-term archiving on: Monday, May 15, 2017 - 12:33:26 PM

File

Vukotic_NCCV_2016.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01467064, version 1

Citation

Vedran Vukotić, Silvia-Laura Pintea, Christian Raymond, Guillaume Gravier, Jan Van Gemert. One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network. Netherlands Conference on Computer Vision (NCCV), Dec 2016, Lunteren, Netherlands. ⟨hal-01467064⟩
