
Text Clustering to Support Knowledge Acquisition from Documents

Stéphane Lapalut 1
1 ACACIA - Knowledge acquisition for aided design through agent interaction
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : At the early stage of the knowledge acquisition process, interviews with experts produce a large amount of rich but ill-structured text. Knowledge engineers need tools to help them exploit these texts. We propose the use of a statistical method, top-down hierarchical classification, together with a new interpretation of its results. The initial statistical analysis proposed by M. Reinert (Reinert, 1979 and 1992) gives two kinds of results: first, a segmentation of the texts that reflects their «semantic contexts», which we use to bring out the structure of the texts; and second, classes of significant terms belonging to these contexts, which can be related to the experts or to their specialities. In this paper, we describe the method, its empirical validity and its comparison with similar approaches, and its uses, with examples and results. We conclude with some research directions for dealing with so-called "ontologies" of experts' domains.

Cited literature : 19 references
Contributor : Rapport de Recherche Inria
Submitted on : Wednesday, May 24, 2006 - 2:23:35 PM
Last modification on : Friday, February 4, 2022 - 3:24:02 AM
Long-term archiving on : Sunday, April 4, 2010 - 9:37:36 PM


  • HAL Id : inria-00074051, version 1



Stéphane Lapalut. Text Clustering to Support Knowledge Acquisition from Documents. RR-2639, INRIA. 1995. ⟨inria-00074051⟩


