Identifying Redundancy and Exposing Provenance in Crowdsourced Data Analysis

Wesley Willett; Shiry Ginosar; Avital Steinitz; Björn Hartmann; Maneesh Agrawala

doi:10.1109/TVCG.2013.164

Journal article. IEEE Transactions on Visualization and Computer Graphics. Year: 2013

Wesley Willett

Role: Corresponding author
PersonId: 944334

Analysis and Visualization

Shiry Ginosar

Role: Author
PersonId: 945895

Computer Science Division [Berkeley]

Avital Steinitz

Role: Author
PersonId: 945896

Computer Science Division [Berkeley]

Björn Hartmann

Role: Author
PersonId: 945897

Computer Science Division [Berkeley]

Maneesh Agrawala

Role: Author
PersonId: 945898

Computer Science Division [Berkeley]

Abstract

We present a system that lets analysts use paid crowd workers to explore data sets and helps analysts interactively examine and build upon workers' insights. We take advantage of the fact that, for many types of data, independent crowd workers can readily perform basic analysis tasks like examining views and generating explanations for trends and patterns. However, workers operating in parallel often generate redundant explanations. Moreover, because workers have different competencies and domain knowledge, some responses are likely to be more plausible than others. To use the crowd's work efficiently, analysts must be able to quickly identify and consolidate redundant responses and determine which explanations are the most plausible. In this paper, we demonstrate several crowd-assisted techniques to help analysts make better use of crowdsourced explanations: (1) We explore crowd-assisted strategies that utilize multiple workers to detect redundant explanations. We introduce color clustering with representative selection, a strategy in which multiple workers cluster explanations and we automatically select the most-representative result, and show that it generates clusterings that are as good as those produced by experts. (2) We capture explanation provenance by introducing highlighting tasks and capturing workers' browsing behavior via an embedded web browser, and refine that provenance information via source-review tasks. We expose this information in an explanation-management interface that allows analysts to interactively filter and sort responses, select the most plausible explanations, and decide which to explore further.
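
The representative-selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the "most-representative" clustering is chosen, so this example assumes one plausible criterion: score each worker's clustering by its mean Rand index against the other workers' clusterings and keep the highest-scoring one. The function names and the sample labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of item pairs on which two clusterings agree.

    `a` and `b` are lists of cluster labels, one label per explanation.
    A pair counts as agreement when both clusterings group the two
    items together, or both keep them apart.
    """
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        return 1.0
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def most_representative(clusterings):
    """Index of the clustering with the highest mean Rand index
    against all other clusterings (assumed selection criterion)."""
    def mean_agreement(k):
        others = [c for idx, c in enumerate(clusterings) if idx != k]
        if not others:
            return 1.0  # a single clustering is trivially representative
        return sum(rand_index(clusterings[k], c) for c in others) / len(others)

    return max(range(len(clusterings)), key=mean_agreement)


# Hypothetical input: three workers each assign a label ("color")
# to the same five explanations. Label names need not match across workers.
workers = [
    ["red", "red", "blue", "blue", "green"],
    ["a", "a", "b", "b", "b"],
    ["x", "y", "x", "y", "z"],
]
best = most_representative(workers)
print("most representative worker:", best)
```

Because the Rand index compares only which pairs of items are grouped together, workers are free to use different colors or label names for the same cluster.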

Keywords

Crowdsourcing, Social Data Analysis

Domains

Human-Computer Interaction [cs.HC]

Main file

CrowdAnalytics-VAST2013-FinalCameraReady.pdf (1.78 MB)

CrowdAnalytics-VAST-Thumb.png (18.25 KB)

Origin: Files produced by the author(s)

Format: Figure, Image

Contributor: Wesley Willett

https://inria.hal.science/hal-00850827

Submitted on: Friday, October 11, 2013, 07:00:17

Last modified on: Wednesday, February 14, 2024, 03:09:29

Long-term archiving on: Wednesday, April 5, 2017, 20:26:16

Dates and versions

hal-00850827, version 1 (11-10-2013)

Identifiers

HAL Id: hal-00850827, version 1
DOI: 10.1109/TVCG.2013.164

Cite

Wesley Willett, Shiry Ginosar, Avital Steinitz, Björn Hartmann, Maneesh Agrawala. Identifying Redundancy and Exposing Provenance in Crowdsourced Data Analysis. IEEE Transactions on Visualization and Computer Graphics, 2013, 19 (12), pp. 2198-2206. ⟨10.1109/TVCG.2013.164⟩. ⟨hal-00850827⟩

Collections

EC-PARIS CNRS INRIA UMR8623 INRIA2 UNIV-PARIS-SACLAY LISN LISN-AVIZ

554 views

383 downloads
