Text Clustering to Support Knowledge Acquisition from Documents - Inria - Institut national de recherche en sciences et technologies du numérique
Report, Year: 1995

Text Clustering to Support Knowledge Acquisition from Documents

Abstract

In the early stages of the knowledge acquisition process, interviews with experts produce a large amount of rich but ill-structured text. Knowledge engineers need tools to help them exploit these texts. We propose the use of a statistical method, top-down hierarchical classification, together with a new interpretation of its results. The initial statistical analysis proposed by M. Reinert (Reinert, 1979 and 1992) gives two kinds of results: first, a segmentation of the texts that reflects their "semantic contexts", which we use to bring out the structure of the texts; and second, classes of significant terms belonging to these contexts, which can be related to the experts or to their specialities. In this paper, we describe the method, its empirical validity, and how it compares with similar approaches, and we illustrate its uses with examples and results. We conclude with some research directions for dealing with so-called "ontologies" of experts' domains.
Main file
RR-2639.pdf (227.22 KB)

Dates and versions

inria-00074051, version 1 (24-05-2006)

Identifiers

  • HAL Id: inria-00074051, version 1

Cite

Stéphane Lapalut. Text Clustering to Support Knowledge Acquisition from Documents. RR-2639, INRIA. 1995. ⟨inria-00074051⟩
81 views
344 downloads
