Reports (Research report)

An Introduction to symbolic data analysis

Abstract : The main aim of the symbolic approach in data analysis is to extend the problems, methods and algorithms used on standard data to more complex data called "symbolic objects", so named to distinguish them from the objects (described by numerical or categorical variables) treated by standard data analysis methods. Symbolic objects extend the classical objects of data analysis in two ways: first, in the case of individuals, by allowing structured information to be introduced in their definition; second, in the case of sets or classes, by being intentionally defined. In both cases, in order to represent uncertain knowledge, it may be useful to use probabilities, possibilities (in the case of vagueness and imprecision, for instance) or belief (in the case of probabilities known only on parts, and to express ignorance). That is why we introduce several kinds of symbolic objects: boolean, possibilist, probabilist and belief. We briefly present some of their qualities and properties, state three theorems, and show how probability, possibility and evidence theories may be extended to these objects. Some mixture decomposition problems on these objects are settled. We show that in some cases fractals are well adapted to represent the duality between symbolic objects. Sets of symbolic objects are represented by categories of different kinds (hierarchies, pyramids and lattices). Four kinds of data analysis problems, including the symbolic extension, are illustrated by several algorithms which induce knowledge from classical data or from a set of symbolic objects. Finally, the important steps of a symbolic data analysis are described and illustrated by an example concerning road accidents.
Contributor : Rapport De Recherche Inria
Submitted on : Wednesday, May 24, 2006 - 4:12:31 PM
Last modification on : Wednesday, October 26, 2022 - 8:16:49 AM
Long-term archiving on : Tuesday, April 12, 2011 - 3:51:26 PM


  • HAL Id : inria-00074738, version 1



Edwin Diday. An Introduction to symbolic data analysis. [Research Report] RR-1936, INRIA. 1993. ⟨inria-00074738⟩


