Building a treebank of noisy user-generated content: The French Social Media Bank

Abstract: We introduce the French Social Media Bank, the first user-generated content treebank for French. Its first release contains 1,700 sentences from various Web 2.0 and social media sources (FACEBOOK, TWITTER, web forums), including data specifically chosen for their high noisiness.
Document type:
Conference paper
TLT 11 - The 11th International Workshop on Treebanks and Linguistic Theories, Nov 2012, Lisbon, Portugal. 2012, 〈http://pauillac.inria.fr/~seddah/tlt2012.pdf〉

https://hal.inria.fr/hal-00780898
Contributor: Djamé Seddah
Submitted on: Friday, January 25, 2013 - 01:48:14
Last modified on: Saturday, June 9, 2018 - 10:30:03

Identifiers

  • HAL Id: hal-00780898, version 1

Citation

Djamé Seddah, Benoît Sagot, Marie Candito, Virginie Mouilleron, Vanessa Combet. Building a treebank of noisy user-generated content: The French Social Media Bank. TLT 11 - The 11th International Workshop on Treebanks and Linguistic Theories, Nov 2012, Lisbon, Portugal. 2012, 〈http://pauillac.inria.fr/~seddah/tlt2012.pdf〉. 〈hal-00780898〉
