A semantic similarity measure for content-based classification of documents

Rim Al Hulou; Amedeo Napoli; Emmanuel Nauer

Report, Year: 2004


Rim Al Hulou

Role: Author
PersonId: 831790

Knowledge representation, reasoning

Amedeo Napoli

Role: Author
PersonId: 743383
IdHAL: amedeo-napoli
IdRef: 034282297

Knowledge representation, reasoning

Emmanuel Nauer

Role: Author
PersonId: 175403
IdHAL: emmanuel-nauer
ORCID: 0000-0001-5756-0031
IdRef: 152295399

Knowledge representation, reasoning

Abstract

In the framework of the Semantic Web, content-based processing of data is considered an essential component of semantic interoperability between applications. To enable such processing, many W3C standardisation proposals (RDF, RDF Schema, OWL, etc.) have been made. In this paper, we present an approach to semantic similarity based on content-based annotation of textual documents. The content of a document is represented by a labelled tree whose nodes are labelled with concepts from an ontology that represents the knowledge of the domain. A reasoning process then compares the labelled trees representing the documents, and thereby the documents themselves. Once completed, this comparison makes it possible to compute a semantic similarity measure between documents. Finally, we show how this measure can be used to classify documents according to their content.
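The abstract does not give the measure itself, so the following is only a minimal sketch of the general idea (documents as concept-labelled trees, compared through an ontology), not the report's method. Everything in it is an assumption made for illustration: the toy ontology `PARENT`, the `Node` class, the use of a Wu-Palmer style score for comparing two concepts, the greedy child matching in `tree_sim`, and the nearest-prototype rule in `classify`.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology: concept -> parent concept (None for the root).
PARENT = {
    "Thing": None,
    "Publication": "Thing",
    "Article": "Publication",
    "Report": "Publication",
    "Person": "Thing",
    "Researcher": "Person",
    "Student": "Person",
}


def ancestors(concept):
    """Path from a concept up to the root, starting with the concept itself."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def lcs(a, b):
    """Least common subsumer: most specific concept that generalises a and b."""
    seen = set(ancestors(a))
    for c in ancestors(b):
        if c in seen:
            return c
    raise ValueError("concepts do not share a root")


def depth(concept):
    """Depth in the hierarchy; the root has depth 1."""
    return len(ancestors(concept))


def concept_sim(a, b):
    """Wu-Palmer style score in (0, 1]; an assumed stand-in, not the report's measure."""
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))


@dataclass
class Node:
    """A node of a labelled tree; the label is a concept of the ontology."""
    concept: str
    children: list = field(default_factory=list)


def tree_sim(t1, t2):
    """Mean of the root-label score and a symmetric best-match score over children."""
    root = concept_sim(t1.concept, t2.concept)
    if not t1.children and not t2.children:
        return root
    best = [max((tree_sim(c, d) for d in t2.children), default=0.0) for c in t1.children]
    best += [max((tree_sim(d, c) for c in t1.children), default=0.0) for d in t2.children]
    return (root + sum(best) / len(best)) / 2


def classify(doc, prototypes):
    """Assign doc to the class whose (hypothetical) prototype tree is most similar."""
    return max(prototypes, key=lambda name: tree_sim(doc, prototypes[name]))


doc_a = Node("Article", [Node("Researcher")])
doc_b = Node("Report", [Node("Student")])
doc_c = Node("Researcher", [Node("Article")])
print(tree_sim(doc_a, doc_b), tree_sim(doc_a, doc_c))
print(classify(doc_a, {"publications": doc_b, "people": doc_c}))
```

In this sketch two trees score higher when their labels meet at a more specific concept of the hierarchy, which is one simple way to make a comparison depend on content rather than on surface wording.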

Keywords

document annotation; semantic similarity; semantic similarity measure; similarity tree; classification; hierarchy of classes

Domains

Other [cs.OH]

https://inria.hal.science/inria-00100236

Submitted on: Tuesday, 26 September 2006, 10:15:57

Last modified on: Friday, 24 March 2023, 14:52:48

Dates and versions

inria-00100236 , version 1 (26-09-2006)

Identifiers

HAL Id : inria-00100236 , version 1

Cite

Rim Al Hulou, Amedeo Napoli, Emmanuel Nauer. A semantic similarity measure for content-based classification of documents. [Internal report] A04-R-518 || al_hulou04c, 2004, 15 p. ⟨inria-00100236⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA LARA
