Combining Statistical Information and Semantic Similarity for Short Text Feature Extension

Abstract : A short text feature extension method combining statistical information and semantic similarity is proposed. First, after defining the contribution of a word and mutual information, a set of associated word pairs is generated by comparing each mutual information value with a threshold; this set is then used as the query word set for searching HowNet. For each word pair, senses are looked up in the HowNet knowledge base, and the semantic similarity of the query word pair is calculated. If a common sememe satisfies the condition, it is added to the original term vector as an extended feature; otherwise, the semantic relationship is computed and the corresponding sememe is added to the feature set. This process is repeated until an extended feature set is obtained. Experimental results show the effectiveness of our method.
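The first step of the abstract, selecting associated word pairs by comparing mutual information with a threshold, can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions of word contribution or mutual information, so this example assumes pointwise mutual information estimated from document-level co-occurrence; the function name `associated_word_pairs` and its parameters are hypothetical, and the HowNet lookup is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def associated_word_pairs(docs, threshold):
    """Return word pairs whose document-level PMI exceeds `threshold`.

    Assumption: PMI over document co-occurrence stands in for the
    paper's mutual information measure, which the abstract does not define.
    `docs` is a list of tokenized short texts (lists of words).
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words
    for doc in docs:
        vocab = sorted(set(doc))  # sorted so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    pairs = {}
    for (w1, w2), n_both in pair_df.items():
        # PMI = log( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
        # estimated as document frequencies divided by n_docs.
        pmi = math.log(n_both * n_docs / (word_df[w1] * word_df[w2]))
        if pmi > threshold:
            pairs[(w1, w2)] = pmi
    return pairs
```

In a pipeline like the one described, the returned pairs would serve as the query set for the knowledge-base lookup; the threshold is a tuning parameter, and the abstract does not state the value used by the authors.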
Document type :
Conference papers

https://hal.inria.fr/hal-01614984
Contributor : Hal Ifip
Submitted on : Wednesday, October 11, 2017 - 4:57:32 PM
Last modification on : Wednesday, October 11, 2017 - 5:00:33 PM
Long-term archiving on : Friday, January 12, 2018 - 3:48:44 PM

File

433802_1_En_21_Chapter.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Xiaohong Li, Yun Su, Huifang Ma, Lin Cao. Combining Statistical Information and Semantic Similarity for Short Text Feature Extension. 9th International Conference on Intelligent Information Processing (IIP), Nov 2016, Melbourne, VIC, Australia. pp.205-210, ⟨10.1007/978-3-319-48390-0_21⟩. ⟨hal-01614984⟩
