An Introduction to symbolic data analysis

Abstract: The main aim of the symbolic approach in data analysis is to extend the problems, methods and algorithms used on standard data to more complex data called "symbolic objects", so named to distinguish them from the objects (described by numerical or categorical variables) treated by standard data analysis methods. Symbolic objects extend the classical objects of data analysis in two ways: first, in the case of individuals, by allowing structured information in their definition; second, in the case of sets or classes, by being intensionally defined. In both cases, to represent uncertain knowledge, it may be useful to use probabilities, possibilities (in the case of vagueness and imprecision, for instance) or beliefs (in the case of probabilities known only on parts, and to express ignorance). This is why we introduce several kinds of symbolic objects: boolean, possibilist, probabilist and belief. We briefly present some of their qualities and properties and three theorems, and show how probability, possibility and evidence theories may be extended to these objects. Some mixture decomposition problems on these objects are settled. We show that in some cases fractals are well suited to representing the duality between symbolic objects. Sets of symbolic objects are represented by categories of different kinds (hierarchies, pyramids and lattices). Four kinds of data analysis problems, including their symbolic extension, are illustrated by several algorithms that induce knowledge from classical data or from a set of symbolic objects. Finally, the important steps of a symbolic data analysis are described and illustrated by an example concerning road accidents.
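
As a rough illustration of what "intensionally defined" can mean for a boolean symbolic object, the sketch below describes a class by a conjunction of conditions on variables and computes its extent, that is, the individuals satisfying the description. The class name, helper functions, variable names and toy accident records are all hypothetical and are not taken from the report; this is only one possible reading of the idea, restricted to the boolean case.

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping

# All names below are hypothetical illustrations, not the report's notation.
Condition = Callable[[object], bool]


def in_interval(low, high) -> Condition:
    """Condition: the value lies in the closed interval [low, high]."""
    return lambda value: low <= value <= high


def in_set(*allowed) -> Condition:
    """Condition: the value is one of the allowed categories."""
    return lambda value: value in allowed


@dataclass(frozen=True)
class BooleanSymbolicObject:
    """A class described by a conjunction of conditions on variables."""
    conditions: Mapping[str, Condition]

    def matches(self, individual: Mapping[str, object]) -> bool:
        # An individual fits only if every condition holds on its values;
        # a missing variable counts as a failed condition.
        return all(
            name in individual and test(individual[name])
            for name, test in self.conditions.items()
        )

    def extent(self, individuals: Mapping[str, Mapping[str, object]]) -> List[str]:
        # The extent: identifiers of all individuals matching the description.
        return [key for key, ind in individuals.items() if self.matches(ind)]


# Invented toy records, loosely echoing the road-accident theme.
accidents = {
    "a1": {"speed": 95, "road": "highway"},
    "a2": {"speed": 40, "road": "urban"},
    "a3": {"speed": 120, "road": "rural"},
}

# [speed in [90, 130]] AND [road in {highway, rural}]
fast_non_urban = BooleanSymbolicObject({
    "speed": in_interval(90, 130),
    "road": in_set("highway", "rural"),
})

print(fast_non_urban.extent(accidents))  # ['a1', 'a3']
```

The probabilist, possibilist and belief variants mentioned in the abstract would replace the true/false match with a graded degree; that is not attempted here.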
Document type:
Report
[Research Report] RR-1936, INRIA. 1993

https://hal.inria.fr/inria-00074738
Contributor: Rapport de Recherche Inria
Submitted on: Wednesday, May 24, 2006 - 16:12:31
Last modified on: Friday, May 25, 2018 - 12:02:06
Document(s) archived on: Tuesday, April 12, 2011 - 15:51:26

Identifiers

  • HAL Id : inria-00074738, version 1

Citation

E. Diday. An Introduction to symbolic data analysis. [Research Report] RR-1936, INRIA. 1993. 〈inria-00074738〉
