
A semantic similarity measure for content-based classification of documents

Rim Al Hulou 1 Amedeo Napoli 1 Emmanuel Nauer 1
1 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : In the framework of the Semantic Web, content-based processing of data is considered an essential component of semantic interoperability between applications. To enable such processing, many W3C standardisation proposals (RDF, RDF Schema, OWL, etc.) have been made. In this paper, we present an approach to semantic similarity based on content-based annotation of textual documents. The content of a document is represented by a labelled tree whose nodes correspond to concepts in an ontology that represents the knowledge of the domain. A reasoning process then compares the labelled trees representing the documents, and thus the documents themselves. Once completed, this comparison yields a semantic similarity measure between documents. Finally, we show how this measure can be used to classify documents according to their content.
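The abstract does not give the measure itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical toy ontology (`PARENT`), compares concepts with a Wu-Palmer style depth ratio (a swapped-in, well-known technique), and compares two labelled trees by averaging the root similarity with the mean best-match similarity of their children. All names (`PARENT`, `concept_sim`, `tree_sim`) are invented for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy ontology: each concept maps to its parent (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "Thing": None,
    "Document": "Thing",
    "Article": "Document",
    "Report": "Document",
    "Person": "Thing",
    "Author": "Person",
}

# A labelled tree: (concept label, list of child subtrees).
Tree = Tuple[str, List["Tree"]]


def ancestors(concept: str) -> List[str]:
    """Path from the concept up to the ontology root, concept first."""
    path: List[str] = []
    current: Optional[str] = concept
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def concept_sim(a: str, b: str) -> float:
    """Wu-Palmer style similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    path_a, path_b = ancestors(a), ancestors(b)
    common = set(path_b)
    # First shared concept on the way up from `a` is the deepest common ancestor.
    lcs = next(c for c in path_a if c in common)
    depth_lcs = len(ancestors(lcs))  # the root has depth 1
    return 2 * depth_lcs / (len(path_a) + len(path_b))


def tree_sim(t1: Tree, t2: Tree) -> float:
    """Assumed tree measure: mean of root similarity and child best-match score."""
    (label1, kids1), (label2, kids2) = t1, t2
    root = concept_sim(label1, label2)
    if not kids1 and not kids2:
        return root
    if not kids1 or not kids2:
        return root / 2  # one tree has no structure to match at this level
    # Symmetric best-match average over the children of both trees.
    best1 = [max(tree_sim(c1, c2) for c2 in kids2) for c1 in kids1]
    best2 = [max(tree_sim(c1, c2) for c1 in kids1) for c2 in kids2]
    children = (sum(best1) + sum(best2)) / (len(best1) + len(best2))
    return (root + children) / 2


doc_a: Tree = ("Document", [("Article", []), ("Author", [])])
doc_b: Tree = ("Document", [("Report", []), ("Author", [])])
score = tree_sim(doc_a, doc_b)
```

With a measure of this kind, a document could be assigned to the class whose representative tree it resembles most; how the report actually performs classification is not stated in the abstract.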

Contributor : Publications Loria
Submitted on : Tuesday, September 26, 2006 - 10:15:57 AM
Last modification on : Friday, February 26, 2021 - 3:28:05 PM


  • HAL Id : inria-00100236, version 1



Rim Al Hulou, Amedeo Napoli, Emmanuel Nauer. A semantic similarity measure for content-based classification of documents. [Intern report] A04-R-518 || al_hulou04c, 2004, 15 p. ⟨inria-00100236⟩


