A semantic similarity measure for content-based classification of documents - Inria - French National Institute for Research in Digital Science and Technology
Report, Year: 2004

A semantic similarity measure for content-based classification of documents

Rim Al Hulou
  • Role: Author
  • PersonId: 831790
Amedeo Napoli
Emmanuel Nauer

Abstract

In the framework of the Semantic Web, content-based processing of data is considered an essential component of semantic interoperability between applications. To enable such processing, many W3C standardisation proposals (RDF, RDF Schema, OWL, etc.) have been put forward. In this paper, we present an approach to semantic similarity based on content-based annotation of textual documents. The content of each document is represented by a labelled tree whose nodes correspond to concepts in an ontology that represents the knowledge of the data domain. A reasoning process then compares the labelled trees representing the documents, and thereby the documents themselves. Once completed, this comparison makes it possible to compute a semantic similarity measure between documents. Finally, we show how this measure can be used to classify documents according to their content.
No file deposited
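The abstract does not give the measure itself, so the following is only a minimal sketch of the general idea, under stated assumptions: the toy ontology `PARENT`, the `Node` class, and the functions `concept_sim`, `tree_sim`, and `classify` are all hypothetical names introduced for illustration. Concept similarity uses a Wu-Palmer-style depth ratio as a stand-in, and the tree comparison is a simple recursive best-match average, not the reasoning process described in the report.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology: concept -> parent concept (the root has no parent).
PARENT = {
    "Thing": None,
    "Publication": "Thing",
    "Article": "Publication",
    "Report": "Publication",
    "Person": "Thing",
    "Author": "Person",
}


def ancestors(concept):
    """Return the path from a concept up to the ontology root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def concept_sim(a, b):
    """Wu-Palmer-style similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    path_a, path_b = ancestors(a), ancestors(b)
    # Lowest common subsumer: first ancestor of a that is also an ancestor of b.
    lcs = next(c for c in path_a if c in path_b)
    # Depth is the number of concepts on the path to the root (root depth = 1).
    return 2 * len(ancestors(lcs)) / (len(path_a) + len(path_b))


@dataclass
class Node:
    """A labelled tree node; the label is a concept of the ontology."""
    concept: str
    children: list = field(default_factory=list)


def tree_sim(t1, t2):
    """Average the root-concept similarity with a symmetric best-match score of the children."""
    root = concept_sim(t1.concept, t2.concept)
    if not t1.children and not t2.children:
        return root
    if not t1.children or not t2.children:
        return root / 2  # assumption: unmatched children contribute zero

    def directed(xs, ys):
        # Mean, over subtrees in xs, of the best similarity to any subtree in ys.
        return sum(max(tree_sim(x, y) for y in ys) for x in xs) / len(xs)

    kids = (directed(t1.children, t2.children) + directed(t2.children, t1.children)) / 2
    return (root + kids) / 2


def classify(doc, prototypes):
    """Assign a document tree to the class whose prototype tree is most similar."""
    return max(prototypes, key=lambda name: tree_sim(doc, prototypes[name]))


doc_a = Node("Article", [Node("Author")])
doc_b = Node("Report", [Node("Author")])
doc_c = Node("Person", [Node("Article")])

print(tree_sim(doc_a, doc_b))
print(tree_sim(doc_a, doc_c))
print(classify(doc_a, {"reports": doc_b, "people": doc_c}))
```

In this sketch, two documents score highly when their trees have similar shapes and their node concepts sit close together in the ontology; classification then reduces to picking the most similar prototype tree.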

Dates and versions

inria-00100236, version 1 (26-09-2006)

Identifiers

  • HAL Id: inria-00100236, version 1

Cite

Rim Al Hulou, Amedeo Napoli, Emmanuel Nauer. A semantic similarity measure for content-based classification of documents. [Intern report] A04-R-518 || al_hulou04c, 2004, 15 p. ⟨inria-00100236⟩