Lexical descriptions for Vietnamese language processing - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2004

Lexical descriptions for Vietnamese language processing

Thanh Bon Nguyen
  • Role: Author
  • PersonId: 835933
Xuan Luong Vu
  • Role: Author

Abstract

Only very recently have Vietnamese researchers begun to be involved in the domain of Natural Language Processing. As there does not exist any published work in formal linguistics or any recognizable standard for Vietnamese word categories, fundamental tasks in Vietnamese text analysis, such as part-of-speech tagging and parsing, are very difficult for computer scientists. All necessary linguistic resources have to be built from scratch, and until now almost no resources have been shared in public research. The aim of our project is to build a common linguistic database that is freely and easily exploitable for the automatic processing of Vietnamese. In this paper, we propose an extensible set of Vietnamese syntactic descriptions that can be used for tagset definition and corpus annotation. These descriptors are established so as to serve as a reference set proposal for Vietnamese in the context of ISO subcommittee TC37/SC4 (Language Resource Management).
Main file
A04-R-031.pdf (218.47 KB)

Dates and versions

inria-00107760, version 1 (01-12-2008)

Identifiers

  • HAL Id: inria-00107760, version 1

Cite

Thanh Bon Nguyen, Thi Minh Huyen Nguyen, Laurent Romary, Xuan Luong Vu. Lexical descriptions for Vietnamese language processing. The 1st International Joint Conference on Natural Language Processing - IJCNLP'04 / Workshop on Asian Language Resources, 2004, Sanya, Hainan Island, China, 8 p. ⟨inria-00107760⟩
