Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies

Guillaume Seguin (1, 2), Karteek Alahari (3), Josef Sivic (1, 2), Ivan Laptev (1, 2)
1 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
3 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: We describe a method to obtain a pixel-wise segmentation and pose estimation of multiple people in stereoscopic videos. This task involves challenges such as dealing with unconstrained stereoscopic video, non-stationary cameras, and complex indoor and outdoor dynamic scenes with multiple people. We cast the problem as a discrete labelling task involving multiple person labels, devise a suitable cost function, and optimize it efficiently. The contributions of our work are two-fold. First, we develop a segmentation model incorporating person detections and learnt articulated pose segmentation masks, as well as colour, motion, and stereo disparity cues. The model also explicitly represents depth ordering and occlusion. Second, we introduce a stereoscopic dataset with frames extracted from the feature-length movies "StreetDance 3D" and "Pina". The dataset contains 587 annotated human poses, 1158 bounding box annotations and 686 pixel-wise segmentations of people. It is composed of indoor and outdoor scenes depicting multiple people with frequent occlusions. We demonstrate results on our new challenging dataset, as well as on the H2view dataset of Sheasby et al. (ACCV 2012).
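To illustrate what "a discrete labelling task" with a cost function can look like, here is a minimal sketch in Python. It is not the cost function of the paper, which the abstract does not specify: it assumes a generic energy made of a per-pixel cost plus a Potts smoothness penalty on 4-connected neighbours, and it minimises that energy with a simple greedy scheme (iterated conditional modes) rather than the authors' optimizer. The names `labelling_energy`, `icm`, `unary` and `smoothness` are hypothetical.

```python
def labelling_energy(labels, unary, smoothness=1.0):
    """Energy of a labelling under an assumed Potts model (not the paper's model).

    labels: H x W nested list of ints (0 = background, 1..K = person labels).
    unary:  H x W x (K+1) nested list; unary[y][x][l] is the cost of label l.
    """
    h, w = len(labels), len(labels[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            energy += unary[y][x][labels[y][x]]
            # Potts penalty for each right/down neighbour with a different label.
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                energy += smoothness
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                energy += smoothness
    return energy


def local_cost(labels, unary, y, x, label, smoothness):
    """Cost of assigning `label` to pixel (y, x) with its neighbours held fixed."""
    h, w = len(labels), len(labels[0])
    cost = unary[y][x][label]
    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
        if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != label:
            cost += smoothness
    return cost


def icm(unary, smoothness=1.0, sweeps=5):
    """Greedy minimisation: start from the cheapest label per pixel, then
    repeatedly re-assign each pixel to its locally cheapest label."""
    h, w = len(unary), len(unary[0])
    num_labels = len(unary[0][0])
    labels = [[min(range(num_labels), key=unary[y][x].__getitem__)
               for x in range(w)] for y in range(h)]
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                best = min(range(num_labels),
                           key=lambda l: local_cost(labels, unary, y, x, l, smoothness))
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break  # no pixel changed label: a local minimum was reached
    return labels


if __name__ == "__main__":
    # One row of three pixels, two labels (background and one person).
    unary = [[[0.0, 1.0], [0.6, 0.4], [0.0, 1.0]]]
    result = icm(unary)
    print(result, labelling_energy(result, unary))
```

In this toy example the middle pixel slightly prefers the person label on its own, but the smoothness penalty makes a uniform background labelling cheaper overall. Each greedy update can only keep or lower the energy, so the procedure stops at a local minimum; it gives no guarantee of reaching the global one.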
Document type:
Journal article
IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2015, 37 (8), pp. 1643-1655. DOI: 10.1109/TPAMI.2014.2369050

https://hal.inria.fr/hal-01089660
Contributor: Karteek Alahari
Submitted on: Friday, 7 August 2015 - 19:35:26
Last modified on: Thursday, 29 September 2016 - 01:22:39
Document(s) archived on: Wednesday, 26 April 2017 - 09:44:40

File

seguin15.pdf
Files produced by the author(s)

Citation

Guillaume Seguin, Karteek Alahari, Josef Sivic, Ivan Laptev. Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies. IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2015, 37 (8), pp. 1643-1655. DOI: 10.1109/TPAMI.2014.2369050. HAL: hal-01089660v2
