Much Ado About Time: Exhaustive Annotation of Temporal Data

Gunnar A. Sigurdsson; Olga Russakovsky; Ali Farhadi; Ivan Laptev; Abhinav Gupta

Preprint, Working Paper. Year: 2016

Gunnar A. Sigurdsson

Role: Author

Computer Science Department - Carnegie Mellon University

Models of visual object recognition and scene understanding

Olga Russakovsky

Role: Author

Computer Science Department - Carnegie Mellon University

Ali Farhadi

Role: Author

University of Washington [Seattle]

Allen Institute for Artificial Intelligence

Ivan Laptev

Role: Author

Models of visual object recognition and scene understanding

Abhinav Gupta

Role: Author

Computer Science Department - Carnegie Mellon University

Allen Institute for Artificial Intelligence

Abstract

Large-scale annotated datasets allow AI systems to learn from and build upon the knowledge of the crowd. Many crowdsourcing techniques have been developed for collecting image annotations. These techniques often implicitly rely on the fact that a new input image takes a negligible amount of time to perceive. In contrast, we investigate and determine the most cost-effective way of obtaining high-quality multi-label annotations for temporal data such as videos. Watching even a short 30-second video clip requires a significant time investment from a crowd worker; thus, requesting multiple annotations following a single viewing is an important cost-saving strategy. But how many questions should we ask per video? We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments). We demonstrate that while workers may not correctly answer all questions, the cost-benefit analysis nevertheless favors consensus from multiple such cheap-yet-imperfect iterations over more complex alternatives. When compared with a one-question-per-video baseline, our method is able to achieve a 10% improvement in recall (76.7% ours versus 66.7% baseline) at comparable precision (83.8% ours versus 83.0% baseline) in about half the annotation time (3.8 minutes ours compared to 7.1 minutes baseline). We demonstrate the effectiveness of our method by collecting multi-label annotations of 157 human activities on 1,815 videos.
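
One way to picture the "consensus from multiple cheap-yet-imperfect iterations" idea is a simple vote over several workers' answers to the same set of binary questions, followed by a precision and recall check against reference labels. The sketch below is illustrative only: the vote threshold, the function names, and the activity labels are assumptions made for this example, and the aggregation procedure actually used in the paper is not described on this page.

```python
from collections import defaultdict


def consensus_labels(responses, min_votes=2):
    """Aggregate binary answers from several annotation passes.

    responses: list of dicts, one per worker pass, mapping label -> bool.
    min_votes: hypothetical threshold; a label is kept when at least this
               many passes marked it as present.
    """
    votes = defaultdict(int)
    for answers in responses:
        for label, present in answers.items():
            if present:
                votes[label] += 1
    return {label for label, count in votes.items() if count >= min_votes}


def precision_recall(predicted, truth):
    """Precision and recall of a predicted label set against a reference set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical answers from three workers who each watched the same clip
# and answered the same three binary questions.
responses = [
    {"drinking": True, "sitting": True, "reading": False},
    {"drinking": True, "sitting": False, "reading": False},
    {"drinking": False, "sitting": True, "reading": True},
]
reference = {"drinking", "sitting", "reading"}  # assumed ground truth

labels = consensus_labels(responses, min_votes=2)
precision, recall = precision_recall(labels, reference)
print(sorted(labels), precision, round(recall, 3))
```

In this toy input no single worker answers every question correctly, yet the two-vote consensus recovers two of the three reference labels with no false positives. A lower `min_votes` would trade precision for recall.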

Domains

Human-Computer Interaction [cs.HC]; Computer Vision and Pattern Recognition [cs.CV]

Contributor: Guilhem Chéron

https://inria.hal.science/hal-01431527

Submitted on: Wednesday, January 11, 2017, 09:09:34

Last modified on: Monday, December 11, 2023, 11:30:31

Dates and versions

hal-01431527, version 1 (11-01-2017)

Identifiers

HAL Id: hal-01431527, version 1
arXiv: 1607.07429

Cite

Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta. Much Ado About Time: Exhaustive Annotation of Temporal Data. 2016. ⟨hal-01431527⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 PSL
