Un turc mécanique pour les ressources linguistiques : critique de la myriadisation du travail parcellisé

Abstract: This position paper examines Amazon Mechanical Turk-like systems, whose use in natural language processing has grown steadily over the past few years. According to the mainstream opinion expressed in the literature of the field, these online work platforms make it possible to develop all sorts of quality language resources very quickly and at very low cost, using people who do the work as a hobby. We show here that the situation is far from ideal, whether in terms of quality, price, workers' status, or ethics. We then recall alternatives that already exist or have been proposed. Our goal is twofold: to inform researchers, so that they can make their own choices with all the relevant considerations in mind, and to propose practical and organizational solutions that improve the development of new language resources while limiting the risk of ethical and legal problems, without sacrificing price or quality.
Document type :
Conference papers

https://hal.inria.fr/inria-00617067
Contributor : Benoît Sagot
Submitted on : Thursday, August 25, 2011 - 10:35:16 PM
Last modification on : Monday, September 16, 2019 - 11:45:22 AM
Long-term archiving on: Sunday, December 4, 2016 - 2:34:29 PM

File

TALN2011-MTurk.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00617067, version 1

Citation

Benoît Sagot, Karen Fort, Gilles Adda, Joseph Mariani, Bernard Lang. Un turc mécanique pour les ressources linguistiques : critique de la myriadisation du travail parcellisé. TALN'2011 - Traitement Automatique des Langues Naturelles, Jun 2011, Montpellier, France. ⟨inria-00617067⟩
