Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use

Abstract : This position paper examines Amazon Mechanical Turk, whose use in language processing has grown steadily over the past few years. According to the mainstream opinion expressed in the field's literature, this type of online work platform makes it possible to quickly develop all sorts of quality language resources at a very low price, using people who do the work as a hobby. We shall demonstrate here that the situation is far from ideal. Our goals are manifold: 1- to inform researchers, so that they can make their own choices; 2- to develop alternatives with the help of funding agencies and scientific associations; 3- to propose practical and organizational solutions to improve language resource development, while limiting the risk of ethical and legal issues without sacrificing price or quality; 4- to introduce an Ethics and Big Data Charter for the documentation of language resources.
Document type :
Book sections

Cited literature [41 references]

https://hal.inria.fr/hal-01053047
Contributor : Karën Fort
Submitted on : Tuesday, July 29, 2014 - 2:28:42 PM
Last modification on : Saturday, May 4, 2019 - 1:20:29 AM
Long-term archiving on : Tuesday, November 25, 2014 - 8:16:32 PM

File

LNAI_AMT_Finale.pdf
Files produced by the author(s)

Citation

Karen Fort, Gilles Adda, Benoît Sagot, Joseph Mariani, Alain Couillault. Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use. Vetulani, Zygmunt and Mariani, Joseph. Human Language Technology Challenges for Computer Science and Linguistics, 8387, Springer International Publishing, pp.303-314, 2014, Lecture Notes in Computer Science, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_25⟩. ⟨hal-01053047⟩
