Learning from Web Videos for Event Classification

Nicolas Chesneau 1 Karteek Alahari 1 Cordelia Schmid 1
1 Thoth - Learning models from massive data
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann
Abstract: Traditional approaches for classifying event videos rely on a manually curated training dataset. While this paradigm has achieved excellent results on benchmarks such as the TrecVid multimedia event detection (MED) challenge datasets, it is restricted by the effort involved in careful annotation. Recent approaches have attempted to reduce the need for annotation by automatically extracting images from the web, or by generating queries to retrieve videos. The former fail to exploit the additional cues provided by video data, while the latter still require some manual annotation to generate relevant queries. We take an alternative approach in this paper, leveraging the synergy between visual video data and the associated textual metadata to learn event classifiers without manually annotating any videos. Specifically, we first collect a video dataset with queries constructed automatically from textual descriptions of events, then prune irrelevant videos using text and video data, and finally learn the corresponding event classifiers. We evaluate this approach in the challenging setting where no manually annotated training set is available, i.e., EK0 in the TrecVid challenge, and show state-of-the-art results on the MED 2011 and 2013 datasets.
Document type:
Journal article
IEEE Transactions on Circuits and Systems for Video Technology, Institute of Electrical and Electronics Engineers, 2017, 〈10.1109/TCSVT.2017.2764624〉

https://hal.inria.fr/hal-01618400
Contributor: Nicolas Chesneau
Submitted on: Tuesday, 17 October 2017 - 19:55:03
Last modified on: Tuesday, 21 November 2017 - 11:11:59

File

final_version.pdf
Files produced by the author(s)

Citation

Nicolas Chesneau, Karteek Alahari, Cordelia Schmid. Learning from Web Videos for Event Classification. IEEE Transactions on Circuits and Systems for Video Technology, Institute of Electrical and Electronics Engineers, 2017, 〈10.1109/TCSVT.2017.2764624〉. 〈hal-01618400〉
