Text Clustering to Support Knowledge Acquisition from Documents

Stéphane Lapalut 1
1 ACACIA - Knowledge acquisition for aided design through agent interaction
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : At the early stage of the knowledge acquisition process, interviews with experts produce a large amount of rich but ill-structured text. Knowledge engineers need tools to help them exploit these texts. We propose the use of a statistical method, top-down hierarchical classification, together with a new interpretation of its results. The initial statistical analysis proposed by M. Reinert (Reinert, 1979 and 1992) gives two kinds of results: first, a segmentation of the texts that reflects their «semantic contexts», which we use to bring out the structure of the texts; and second, classes of significant terms belonging to these contexts, which can be related to the experts or to their specialities. In this paper, we describe the method, its empirical validity, its comparison with similar approaches, and its uses, with examples and results. We conclude with some research directions for dealing with so-called "ontologies" of experts' domains.
Document type :
Reports

Cited literature : 19 references

https://hal.inria.fr/inria-00074051
Contributor : Rapport de Recherche Inria
Submitted on : Wednesday, May 24, 2006 - 2:23:35 PM
Last modification on : Monday, October 19, 2020 - 11:07:45 AM
Long-term archiving on : Sunday, April 4, 2010 - 9:37:36 PM

Identifiers

  • HAL Id : inria-00074051, version 1

Citation

Stéphane Lapalut. Text Clustering to Support Knowledge Acquisition from Documents. RR-2639, INRIA. 1995. ⟨inria-00074051⟩

Metrics

Record views : 151
File downloads : 522