A General Approach to Extracting Full Names and Abbreviations for Chinese Entities from the Web

Abstract: Identifying full names and abbreviations for entities is a challenging problem in many applications, e.g. question answering and information retrieval. In this paper, we propose a general method for extracting full names and abbreviations from Chinese Web corpora. For a given entity, we construct forward and backward query items, submit them to a search engine (e.g. Google), and use the search results to extract full names and abbreviations for the entity. To verify the results, we apply filtering and marking methods and sort all the results. Experiments show that our method achieves a precision of 84.7% for abbreviations and 77.0% for full names.
Document type:
Conference paper
Zhongzhi Shi; Sunil Vadera; Agnar Aamodt; David Leake. 6th IFIP TC 12 International Conference on Intelligent Information Processing (IIP), Oct 2010, Manchester, United Kingdom. Springer, IFIP Advances in Information and Communication Technology, AICT-340, pp.271-280, 2010, Intelligent Information Processing V. 〈10.1007/978-3-642-16327-2_33〉

https://hal.inria.fr/hal-01060363
Contributor: Hal Ifip
Submitted on: Tuesday, 21 November 2017 - 16:40:38
Last modified on: Wednesday, 22 November 2017 - 01:22:18

File

JiangCYLW10.pdf
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Guang Jiang, Cao Cungen, Sui Yuefei, Han Lu, Shi Wang. A General Approach to Extracting Full Names and Abbreviations for Chinese Entities from the Web. Zhongzhi Shi; Sunil Vadera; Agnar Aamodt; David Leake. 6th IFIP TC 12 International Conference on Intelligent Information Processing (IIP), Oct 2010, Manchester, United Kingdom. Springer, IFIP Advances in Information and Communication Technology, AICT-340, pp.271-280, 2010, Intelligent Information Processing V. 〈10.1007/978-3-642-16327-2_33〉. 〈hal-01060363〉

Metrics

Record views: 128

File downloads: 21