Multi-region two-stream R-CNN for action detection

Xiaojiang Peng; Cordelia Schmid

doi:10.1007/978-3-319-46493-0_45

Conference paper, Year: 2016


Xiaojiang Peng

Role: Author

Learning models from massive data

Cordelia Schmid

Role: Author

Learning models from massive data

Abstract

We propose a multi-region two-stream R-CNN model for action detection in realistic videos. We start from frame-level action detection based on Faster R-CNN [1], and make three contributions: (1) we show that a motion region proposal network generates high-quality proposals, which are complementary to those of an appearance region proposal network; (2) we show that stacking optical flow over several frames significantly improves frame-level action detection; and (3) we embed a multi-region scheme in the Faster R-CNN model, which adds complementary information on body parts. We then link frame-level detections with the Viterbi algorithm, and temporally localize an action with the maximum subarray method. Experimental results on the UCF-Sports, J-HMDB and UCF101 action detection datasets show that our approach outperforms the state of the art by a significant margin in both frame-mAP and video-mAP.
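
The last two steps of the abstract, linking frame-level detections with the Viterbi algorithm and localizing the action in time with the maximum subarray method, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the `overlap_weight` parameter, the use of box IoU as the linking term, and the assumption that per-frame scores have already been shifted so that non-action frames are negative are all hypothetical choices made for illustration. The sketch also assumes every frame has at least one detection.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: Sequence[Sequence[Tuple[Box, float]]],
                    overlap_weight: float = 1.0) -> List[int]:
    """Viterbi linking (illustrative): choose one detection index per frame so
    that the sum of detection scores plus weighted IoU between consecutive
    boxes is maximal."""
    best = [score for _, score in frames[0]]  # best path score ending at each box
    back: List[List[int]] = []                # backpointers for frames 1..T-1
    for prev, cur in zip(frames, frames[1:]):
        new_best, pointers = [], []
        for box, score in cur:
            cands = [best[j] + overlap_weight * iou(prev[j][0], box)
                     for j in range(len(prev))]
            j_star = max(range(len(prev)), key=cands.__getitem__)
            new_best.append(cands[j_star] + score)
            pointers.append(j_star)
        best = new_best
        back.append(pointers)
    # Trace back from the best final detection.
    idx = max(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    return path[::-1]


def max_subarray(scores: Sequence[float]) -> Tuple[int, int, float]:
    """Kadane's algorithm: return (start, end, total) of the contiguous run of
    frames with the largest score sum; `end` is inclusive."""
    best_sum, best_start, best_end = scores[0], 0, 0
    cur_sum, cur_start = scores[0], 0
    for i in range(1, len(scores)):
        if cur_sum < 0:
            cur_sum, cur_start = scores[i], i  # restart the run at frame i
        else:
            cur_sum += scores[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_start, best_end, best_sum
```

In this sketch, `link_detections` returns the index of the chosen detection in each frame, and `max_subarray` is then applied to the scores along that path to trim the track to its highest-scoring temporal extent.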

Keywords

Action detection; Faster R-CNN; multi-region CNNs; two-stream R-CNN

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

eccv16-pxj-v3.pdf (4.5 MB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-01349107

Submitted on: Thursday, January 5, 2017, 15:50:22

Last modified on: Thursday, April 4, 2024, 18:24:07

Dates and versions

hal-01349107, version 1 (26-07-2016)

hal-01349107, version 2 (04-12-2016)

hal-01349107, version 3 (05-01-2017)

Identifiers

HAL Id: hal-01349107, version 3
DOI: 10.1007/978-3-319-46493-0_45

Cite

Xiaojiang Peng, Cordelia Schmid. Multi-region two-stream R-CNN for action detection. ECCV - European Conference on Computer Vision, Oct 2016, Amsterdam, Netherlands. pp.744-759, ⟨10.1007/978-3-319-46493-0_45⟩. ⟨hal-01349107v3⟩


Collections

UGA CNRS INRIA LJK LJK_GI INRIA2 LJK-GI-THOTH

