Lexical descriptions for Vietnamese language processing

Thanh Bon Nguyen, Thi Minh Huyen Nguyen 1, Laurent Romary 1, Xuan Luong Vu
1 LANGUE ET DIALOGUE - Human-machine dialogue with a significant language component
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Only very recently have Vietnamese researchers begun to be involved in the domain of Natural Language Processing. As there is no published work in formal linguistics and no recognizable standard for Vietnamese word categories, fundamental tasks in Vietnamese text analysis, such as part-of-speech tagging and parsing, are very difficult for computer scientists. All necessary linguistic resources have to be built from scratch, and until now almost no resources have been shared in public research. The aim of our project is to build a common linguistic database that is freely and easily exploitable for the automatic processing of Vietnamese. In this paper, we propose an extensible set of Vietnamese syntactic descriptions that can be used for tagset definition and corpus annotation. These descriptors are established in such a way as to serve as a reference set proposal for Vietnamese in the context of ISO subcommittee TC37/SC4 (Language Resource Management).
Document type:
Conference paper
The 1st International Joint Conference on Natural Language Processing - IJCNLP'04 / Workshop on Asian Language Resources, 2004, Sanya, Hainan Island, China, 8 p, 2004
Cited literature: 15 references

https://hal.inria.fr/inria-00107760
Contributor: Laurent Romary
Submitted on: Monday, 1 December 2008 - 09:40:46
Last modified on: Thursday, 11 January 2018 - 06:19:48
Document(s) archived on: Tuesday, 6 April 2010 - 20:10:01

Identifiers

  • HAL Id: inria-00107760, version 1

Citation

Thanh Bon Nguyen, Thi Minh Huyen Nguyen, Laurent Romary, Xuan Luong Vu. Lexical descriptions for Vietnamese language processing. The 1st International Joint Conference on Natural Language Processing - IJCNLP'04 / Workshop on Asian Language Resources, 2004, Sanya, Hainan Island, China, 8 p, 2004. 〈inria-00107760〉

Metrics

Record views

358

File downloads

910