K. Fort, G. Adda, and K. B. Cohen, Amazon mechanical turk: Gold mine or coal mine? Computational Linguistics, 2011.
DOI : 10.1162/coli_a_00057

URL : http://doi.org/10.1162/coli_a_00057

J. Ross, L. Irani, M. S. Silberman, A. Zaldivar, and B. Tomlinson, Who are the crowdworkers?, Proceedings of the 28th of the international conference extended abstracts on Human factors in computing systems, CHI EA '10, 2010.
DOI : 10.1145/1753846.1753873

P. Ipeirotis, Demographics of mechanical turk. CeDER Working Papers, 2010.

L. Biewald, Better crowdsourcing through automated methods for quality control, SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation, 2010.

M. S. Silberman, J. Ross, L. Irani, and B. Tomlinson, Sellers' problems in human computation markets, Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pp.10-18, 2010.
DOI : 10.1145/1837885.1837891

J. Ross, A. Zaldivar, L. Irani, and B. Tomlinson, Who are the turkers? worker demographics in amazon mechanical turk, 2009.

L. B. Chilton, J. J. Horton, R. C. Miller, and S. Azenkot, Task search in a human computation market, Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, pp.10-11, 2010.
DOI : 10.1145/1837885.1837889

G. Adda and J. Mariani, Language resources and amazon mechanical turk: legal, ethical and other issues, Legal Issues for Sharing Language Resources workshop, LREC 2010, 2010.

S. Novotney and C. Callison-Burch, Cheap, fast and good enough: automatic speech recognition with non-expert transcription, Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. HLT '10, pp.207-215, 2010.

C. Callison-Burch and M. Dredze, Creating speech and language data with amazon's mechanical turk, CSLDAMT '10: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, 2010.

M. Kaisser and J. B. Lowe, Creating a research collection of question answer sentence pairs with amazon's mechanical turk, Proceedings of the International Language Resources and Evaluation Conference (LREC), 2008.

F. Xu and D. Klakow, Paragraph acquisition and selection for list question using amazon's mechanical turk, Proceedings of the International Language Resources and Evaluation Conference (LREC), pp.2340-2345, 2010.

M. Marge, S. Banerjee, and A. I. Rudnicky, Using the Amazon Mechanical Turk for transcription of spoken language, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.14-19, 2010.
DOI : 10.1109/ICASSP.2010.5494979

P. Cook and S. Stevenson, Automatically identifying changes in the semantic orientation of words, Proceedings of the International Language Resources and Evaluation Conference (LREC), 2010.

V. Bhardwaj, R. Passonneau, A. Salleb-Aouissi, and N. Ide, Anveshan: A tool for analysis of multiple annotators' labeling behavior, Proceedings of The fourth linguistic annotation workshop (LAW IV), 2010.

R. Snow, B. O'Connor, D. Jurafsky, and A. Y. Ng, Cheap and fast - but is it good? evaluating non-expert annotations for natural language tasks, Proceedings of EMNLP 2008, pp.254-263, 2008.

D. Gillick and Y. Liu, Non-expert evaluation of summarization systems is risky, Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. CSLDAMT '10, 2010.

S. Tratz and E. Hovy, A taxonomy, dataset, and classifier for automatic noun compound interpretation, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp.678-687, 2010.

P. Wais, S. Lingamneni, D. Cook, J. Fennell, B. Goldenberg et al., Towards building a high-quality workforce with mechanical turk, Proceedings of Computational Social Science and the Wisdom of Crowds (NIPS), 2010.

S. Kochhar, S. Mazzocchi, and P. Paritosh, The anatomy of a large-scale human computation engine, Proceedings of the ACM SIGKDD Workshop on Human Computation, HCOMP '10, 2010.
DOI : 10.1145/1837885.1837890

S. Goldwater and T. Griffiths, A fully bayesian approach to unsupervised part-of-speech tagging, Proceedings of ACL, 2007.

C. Hänig, Improvements in unsupervised co-occurrence based parsing, Proceedings of the Fourteenth Conference on Computational Natural Language Learning. CoNLL '10, pp.1-8, 2010.

S. Abney, Semisupervised Learning for Computational Linguistics, 2007.
DOI : 10.1201/9781420010800

D. Yarowsky, Unsupervised word sense disambiguation rivaling supervised methods, Proceedings of the 33rd annual meeting on Association for Computational Linguistics, pp.189-196, 1995.
DOI : 10.3115/981658.981684

URL : http://acl.ldc.upenn.edu/P/P95/P95-1026.pdf

A. Blum and T. Mitchell, Combining labeled and unlabeled data with co-training, Proceedings of the eleventh annual conference on Computational learning theory, COLT '98, 1998.
DOI : 10.1145/279943.279962

URL : http://axon.cs.byu.edu/~martinez/classes/678/Papers/Mitchell_cotraining.pdf

D. A. Cohn, Z. Ghahramani, and M. I. Jordan, Active learning with statistical models, Advances in Neural Information Processing Systems, pp.705-712, 1995.

N. Smith and J. Eisner, Contrastive estimation, Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, ACL '05, pp.354-362, 2005.
DOI : 10.3115/1219840.1219884

B. Sagot, Automatic Acquisition of a Slovak Lexicon from a Raw Corpus, Proceedings of TSD'05, pp.156-163, 2005.
DOI : 10.1007/11551874_20

R. Watson, T. Briscoe, and J. Carroll, Semi-supervised training of a statistical parser from unlabeled partially-bracketed data, Proceedings of the 10th International Conference on Parsing Technologies. IWPT '07, 2007.

K. Fort and B. Sagot, Influence of Pre-annotation on POS-tagged Corpus Development, Proc. of the Fourth ACL Linguistic Annotation Workshop, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00484294

K. Erk, A. Kowalski, and S. Pado, The salsa annotation tool, Proceedings of the Workshop on Prospects and Advances in the Syntax/Semantics Interface, 2003.

M. Yetisgen-Yildiz, I. Solti, F. Xia, and S. R. Halgrim, Preliminary experience with amazon's mechanical turk for annotating medical named entities, Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. CSLDAMT '10, pp.180-183, 2010.

T. Finin, W. Murnane, A. Karandikar, N. Keller, J. Martineau et al., Annotating named entities in twitter data with crowdsourcing, Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. CSLDAMT '10, 2010.

N. Lawson, K. Eustice, M. Perkowitz, and M. Yetisgen-Yildiz, Annotating large email datasets for named entity recognition with mechanical turk, Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. CSLDAMT '10, pp.71-79, 2010.

J. Nothman, J. R. Curran, and T. Murphy, Transforming Wikipedia into Named Entity Training Data, Proceedings of the Australian Language Technology Workshop, 2008.

D. Balasuriya, N. Ringland, J. Nothman, T. Murphy, and J. R. Curran, Named entity recognition in Wikipedia, Proceedings of the 2009 Workshop on The People's Web Meets NLP Collaboratively Constructed Semantic Resources, People's Web '09, pp.10-18, 2009.
DOI : 10.3115/1699765.1699767

M. Stürenberg, D. Goecke, N. Diewald, I. Cramer, and A. Mehler, Web-based annotation of anaphoric relations and lexical chains, Proceedings of the Linguistic Annotation Workshop, LAW '07, 2007.
DOI : 10.3115/1642059.1642082

L. von Ahn, Games with a Purpose, Computer, vol.39, issue.6, pp.96-98, 2006.
DOI : 10.1109/MC.2006.196

J. Chamberlain, M. Poesio, and U. Kruschwitz, Phrase Detectives: a Web-based Collaborative Annotation Game, Proceedings of the International Conference on Semantic Systems (I-Semantics'08), 2008.

T. Hughes, K. Nakajima, L. Ha, A. Vasu, P. Moreno et al., Building transcribed speech corpora quickly and cheaply for many languages, Proceedings of Interspeech, pp.1914-1917, 2010.

A. Couillault and K. Fort, Charte Éthique et Big Data : parce que mon corpus le vaut bien !, In: Linguistique, Langues et Parole : Statuts, Usages et Mésusages, 2013.