Probabilistic Principal Components and Mixtures, How This Works

Abstract: Classical Principal Component Analysis (PCA) is widely recognized as a method for dimensionality reduction and data visualization. It is a purely algebraic method: it solves an optimization problem fitted exactly to the gathered data vectors, with all their particularities, so no statistical significance tests are possible. An alternative is probabilistic principal component analysis (PPCA), which is formulated on probabilistic grounds. To use it, one has to know the probability distribution of the analyzed data; usually the multivariate Gaussian (MVG) distribution is assumed. But what if the analyzed data are decidedly not MVG? We met this problem when analyzing multivariate gearbox data derived from a heavy-duty machine, and we show here how we dealt with it. In our analysis, we assumed that the data are a mixture of two MVG groups; specifically, each subgroup follows a probabilistic principal component (PPC) distribution with an MVG error function. Then, by applying Bayesian inference, we were able to calculate for each data vector x its a posteriori probability of belonging to data generated by the assumed model. After estimating the parameters of the assumed model, we obtained the means, on a sound statistical basis, for constructing confidence boundaries of the data and finding outliers.
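
The posterior computation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each component has covariance w wᵀ + σ²I with a single principal axis w (so that only the Python standard library is needed), and the function names, parameter values, and test vector are all hypothetical.

```python
import math

def ppca_log_density(x, mu, w, sigma2):
    """Log density of N(mu, w w^T + sigma2 * I), one principal axis w (assumed)."""
    d = len(x)
    r = [xi - mi for xi, mi in zip(x, mu)]          # residual x - mu
    ww = sum(wi * wi for wi in w)                   # w^T w
    rw = sum(ri * wi for ri, wi in zip(r, w))       # r^T w
    rr = sum(ri * ri for ri in r)                   # r^T r
    # Sherman-Morrison: C^{-1} = (I - w w^T / (sigma2 + w^T w)) / sigma2
    maha = (rr - rw * rw / (sigma2 + ww)) / sigma2
    # Matrix determinant lemma: det C = sigma2^(d-1) * (sigma2 + w^T w)
    log_det = (d - 1) * math.log(sigma2) + math.log(sigma2 + ww)
    return -0.5 * (d * math.log(2 * math.pi) + log_det + maha)

def responsibilities(x, weights, components):
    """Posterior probability of each mixture component given x (Bayes' rule)."""
    logs = [math.log(p) + ppca_log_density(x, *c)
            for p, c in zip(weights, components)]
    m = max(logs)                                   # subtract max for stability
    unnorm = [math.exp(v - m) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical two-component model in 3 dimensions: (mu, w, sigma2) per group.
components = [
    ([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], 0.1),
    ([5.0, 5.0, 5.0], [0.0, 1.0, 0.0], 0.1),
]
weights = [0.5, 0.5]
post = responsibilities([0.2, -0.1, 0.0], weights, components)
```

In a full analysis the parameters would be estimated from the data rather than fixed by hand, and a vector with very low density under every component would be a candidate outlier.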
Document type:
Conference paper
Khalid Saeed; Władysław Homenda. 14th Computer Information Systems and Industrial Management (CISIM), Sep 2015, Warsaw, Poland. Springer, Lecture Notes in Computer Science, LNCS-9339, pp.24-35, 2015, Computer Information Systems and Industrial Management. 〈10.1007/978-3-319-24369-6_2〉

https://hal.inria.fr/hal-01444479
Contributor: Hal Ifip
Submitted on: Tuesday, 24 January 2017 - 10:40:56
Last modified on: Wednesday, 25 January 2017 - 01:04:03
Document(s) archived on: Tuesday, 25 April 2017 - 14:22:19

File

978-3-319-24369-6_2_Chapter.pd...
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Anna Bartkowiak, Radoslaw Zimroz. Probabilistic Principal Components and Mixtures, How This Works. Khalid Saeed; Władysław Homenda. 14th Computer Information Systems and Industrial Management (CISIM), Sep 2015, Warsaw, Poland. Springer, Lecture Notes in Computer Science, LNCS-9339, pp.24-35, 2015, Computer Information Systems and Industrial Management. 〈10.1007/978-3-319-24369-6_2〉. 〈hal-01444479〉
