Text Clustering to Support Knowledge Acquisition from Documents

Stéphane Lapalut 1
1 ACACIA - Knowledge acquisition for aided design through agent interaction
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: At the early stage of the knowledge acquisition process, interviews with experts produce a large amount of rich but ill-structured text. Knowledge engineers need tools to help them exploit these texts. We propose the use of a statistical method, top-down hierarchical classification, together with a new interpretation of its results. The initial statistical analysis proposed by M. Reinert (Reinert, 1979 and 1992) gives two kinds of results: first, a segmentation of the texts that reflects their "semantic contexts", which we use to bring out the structure of the texts; and second, classes of significant terms belonging to these contexts, which can be related to the experts or to their specialities. In this paper, we describe the method, its empirical validity, and how it compares with similar approaches, and we illustrate its uses with examples and results. We conclude with some research directions for dealing with so-called "ontologies" of experts' domains.
Document type:
RR-2639, INRIA. 1995

Cited literature [19 references]

Contributor: Rapport de Recherche Inria
Submitted on: Wednesday, 24 May 2006 - 14:23:35
Last modified on: Saturday, 27 January 2018 - 01:31:29
Document(s) archived on: Sunday, 4 April 2010 - 21:37:36



  • HAL Id : inria-00074051, version 1



Stéphane Lapalut. Text Clustering to Support Knowledge Acquisition from Documents. RR-2639, INRIA. 1995. 〈inria-00074051〉


