M. Ranzato, A. Szlam, J. Bruna, M. Mathieu, R. Collobert et al., Video (language) modeling: a baseline for generative models of natural videos, arXiv:1412.6604, 2014.

N. Srivastava, E. Mansimov, and R. Salakhutdinov, Unsupervised learning of video representations using LSTMs, In: ICML, 2015.

M. Mathieu, C. Couprie, and Y. LeCun, Deep multi-scale video prediction beyond mean square error, In: ICLR, 2016.

C. Finn, I. Goodfellow, and S. Levine, Unsupervised learning for physical interaction through video prediction, In: NIPS, 2016.

E. Denton and V. Birodkar, Unsupervised learning of disentangled representations from video, In: NIPS, 2017.

P. Luc, N. Neverova, C. Couprie, J. Verbeek, and Y. LeCun, Predicting Deeper into the Future of Semantic Segmentation, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.77

URL : https://hal.archives-ouvertes.fr/hal-01494296

P. Luc, C. Couprie, J. Verbeek, and Y. LeCun, Predictive learning in feature space for future instance segmentation, In: ECCV, 2018.

J. Oh, X. Guo, H. Lee, R. L. Lewis, and S. P. Singh, Action-conditional video prediction using deep networks in Atari games, In: NIPS, 2015.

A. Dosovitskiy and V. Koltun, Learning to act by predicting the future, In: ICLR, 2017.

J. Walker, C. Doersch, A. Gupta, and M. Hebert, An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders, In: ECCV, vol.33, issue.5, 2016.
DOI : 10.1007/978-3-642-15552-9_51

URL : http://arxiv.org/pdf/1606.07873

E. Denton and R. Fergus, Stochastic video generation with a learned prior, In: ICML, 2018.

M. Babaeizadeh, C. Finn, D. Erhan, R. H. Campbell, and S. Levine, Stochastic variational video prediction, In: ICLR, 2018.

C. Vondrick, H. Pirsiavash, and A. Torralba, Anticipating the future by watching unlabeled video, In: CVPR, 2016.
DOI : 10.1109/cvpr.2016.18

URL : http://arxiv.org/pdf/1504.08023

J. Walker, K. Marino, A. Gupta, and M. Hebert, The Pose Knows: Video Forecasting by Generating Pose Futures, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.361

URL : http://arxiv.org/pdf/1705.00053

A. Bhattacharyya, M. Fritz, and B. Schiele, Long-term on-board prediction of people in traffic scenes under uncertainty, In: CVPR, 2018.

X. Jin, H. Xiao, X. Shen, J. Yang, Z. Lin et al., Predicting scene parsing and motion dynamics in the future, In: NIPS, 2017.

B. Romera-Paredes and P. Torr, Recurrent Instance Segmentation, In: ECCV, vol.27, issue.8, 2016.
DOI : 10.5244/C.29.CVPPP.1

URL : http://arxiv.org/pdf/1511.08250

M. Bai and R. Urtasun, Deep Watershed Transform for Instance Segmentation, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.305

URL : http://arxiv.org/pdf/1611.08303

A. Arnab and P. H. Torr, Pixelwise Instance Segmentation with a Dynamically Instantiated Network, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.100

URL : http://arxiv.org/pdf/1704.02386

P. Pinheiro, T. Y. Lin, R. Collobert, and P. Dollár, Learning to Refine Object Segments, In: ECCV, vol.38, issue.4, 2016.
DOI : 10.5244/C.30.15

URL : http://arxiv.org/pdf/1603.08695

K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask R-CNN, In: ICCV, 2017.

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.39, issue.6, 2015.
DOI : 10.1109/TPAMI.2016.2577031

URL : http://arxiv.org/pdf/1506.01497

T. Watanabe and D. Wolf, Distance to center of mass encoding for instance segmentation, arXiv:1711.09060, 2017.

Y. Boykov and M. P. Jolly, Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, 2001.
DOI : 10.1109/ICCV.2001.937505

Y. Boykov, O. Veksler, and R. Zabih, Fast approximate energy minimization via graph cuts, PAMI, vol.23, 2001.
DOI : 10.1109/34.969114

URL : http://www.csd.uwo.ca/~yuri/Papers/iccv99.pdf

L. Grady, Random Walks for Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.11, pp.1768-1783, 2006.
DOI : 10.1109/TPAMI.2006.233

URL : http://perso.telecom-paristech.fr/~bloch/P6Image/Projets/hippocampe/grady2006.pdf

X. Bai and G. Sapiro, Geodesic Matting: A Framework for Fast Interactive Image and Video Segmentation and Matting, International Journal of Computer Vision, vol.212, issue.5, 2007.
DOI : 10.1561/0600000019

L. Grady and A. K. Sinop, Fast approximate Random Walker segmentation using eigenvector precomputation, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587487

C. Allène, J. Y. Audibert, M. Couprie, and R. Keriven, Some links between extremum spanning forests, watersheds and min-cuts, Image and Vision Computing, 2009.

C. Couprie, L. Grady, L. Najman, and H. Talbot, Power Watershed: A Unifying Graph-Based Optimization Framework, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.7, pp.1384-1399, 2011.
DOI : 10.1109/TPAMI.2010.200

A. Meijster, J. B. Roerdink, and W. H. Hesselink, A General Algorithm for Computing Distance Transforms in Linear Time, pp.331-340, 2000.
DOI : 10.1007/0-306-47025-X_36

URL : https://www.rug.nl/research/portal/files/3059926/2002CompImagVisMeijster.pdf

L. Vincent and P. Soille, Watersheds in digital spaces: an efficient algorithm based on immersion simulations, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.13, issue.6, pp.583-598, 1991.
DOI : 10.1109/34.87344

M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler et al., The Cityscapes Dataset for Semantic Urban Scene Understanding, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.350

URL : http://arxiv.org/pdf/1604.01685