Conference papers

An Efficient Semantic-Based Organization and Similarity Search Method for Internet Data Resources

Abstract : With the development of information technology, a large number of data resources of different types are appearing on the Internet, and some harmful ones have done damage to society and citizens. To protect society, it is important to discover harmful resources among the massive, heterogeneous data resources in cyberspace, and Internet resource discovery has therefore attracted increasing attention. In this paper, we present iHash, a semantic-based organization and similarity search method for Internet data resources. First, iHash normalizes Internet data objects into a high-dimensional feature space, solving the “feature explosion” problem of the feature space. Second, we partition the high-dimensional data in the feature space using a clustering method, transform the data clusters into regular shapes, and organize the high-dimensional data with a Pyramid-like method. Third, we implement range and kNN queries on top of this organization. Finally, we discuss the performance evaluation of iHash and find that it performs similarity search efficiently.
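The abstract mentions organizing high-dimensional data with a Pyramid-style method. As a rough illustration only, and assuming this refers to the well-known Pyramid technique for indexing points in the unit hypercube, the sketch below maps a d-dimensional point to a single pyramid value and answers a kNN query by brute force as a reference. The function names `pyramid_value` and `knn_brute_force` are hypothetical; this is not the iHash implementation, whose clustering and cluster-reshaping steps are not shown.

```python
import math


def pyramid_value(point):
    """Map a point in [0, 1]^d to a 1-D pyramid value (illustrative sketch)."""
    d = len(point)
    # Dimension whose coordinate lies farthest from the centre 0.5.
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    # Height of the point inside its pyramid, in [0, 0.5].
    height = abs(0.5 - point[j_max])
    # Pyramids 0..d-1 cover the low side, d..2d-1 the high side.
    index = j_max if point[j_max] < 0.5 else j_max + d
    return index + height


def knn_brute_force(points, query, k):
    """Reference kNN by exhaustive Euclidean distance (no index used)."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]


# Usage: sort points by pyramid value, as a B+-tree keyed on it would.
data = [(0.25, 0.5), (0.5, 1.0), (0.5, 0.5)]
ordered = sorted(data, key=pyramid_value)
```

In an index built this way, the one-dimensional pyramid values would typically be stored in a B+-tree, and a range query would be translated into interval scans over the pyramids it intersects; the brute-force kNN above serves only as a correctness baseline.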

Cited literature : 18 references
Contributor : Hal Ifip
Submitted on : Tuesday, November 15, 2016 - 4:08:38 PM
Last modification on : Tuesday, September 3, 2019 - 3:04:02 PM
Long-term archiving on : Thursday, March 16, 2017 - 4:40:20 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License



Peige Ren, Xiaofeng Wang, Hao Sun, Baokang Zhao, Chunqing Wu. An Efficient Semantic-Based Organization and Similarity Search Method for Internet Data Resources. 2nd Information and Communication Technology - EurAsia Conference (ICT-EurAsia), Apr 2014, Bali, Indonesia. pp.663-673, ⟨10.1007/978-3-642-55032-4_68⟩. ⟨hal-01397284⟩


