
Classification croisée pour l’analyse de bases de données de grandes dimensions de pharmacovigilance

Abstract : This thesis gathers methodological contributions to the statistical analysis of large pharmacovigilance datasets. Pharmacovigilance data yield large, sparse matrices, and these two characteristics are the main statistical challenges in modelling them. The first part of the thesis is dedicated to coclustering the pharmacovigilance contingency table with the normalized Poisson latent block model. The objective is, on the one hand, to provide pharmacologists with a reduced set of interesting areas to explore in more detail. On the other hand, this coclustering provides useful background information for dealing with the individual database. Within this framework, a parameter estimation procedure for the model is detailed, and objective model selection criteria are developed to choose the best-fitting model. Because the datasets are so large, we propose a procedure that explores the coclustering model space non-exhaustively but in a relevant way. Additionally, to assess the performance of the methods, a convenient coclustering index is developed to compare partitions with large numbers of clusters. These statistical tools are not specific to pharmacovigilance and can be used for any coclustering problem.

The second part of the thesis is devoted to the statistical analysis of the large individual data, which are more numerous but also provide even more valuable information. The aim is to produce clusters of individuals according to their drug profiles, together with subgroups of drugs and adverse effects with possible links between them; this overcomes the coprescription and masking phenomena, which are common contingency table issues in pharmacovigilance. Moreover, the interaction between several adverse effects is taken into account. For this purpose, we propose a new model, the Multiple Latent Block Model (MLBM), which makes it possible to cocluster two binary tables while imposing the same row clustering on both. Assumptions inherent to the model are discussed and sufficient identifiability conditions are presented. A parameter estimation algorithm is then studied and objective model selection criteria are developed. Moreover, a numerical simulation model of the individual data is proposed to compare existing methods and study their limits. Finally, the proposed methodology for dealing with individual pharmacovigilance data is presented and applied to a sample of the French pharmacovigilance database covering 2002 to 2010.

Cited literature [50 references]
Contributor : Valérie Robert
Submitted on : Monday, January 29, 2018 - 2:53:26 PM
Last modification on : Saturday, June 25, 2022 - 10:28:55 PM
Long-term archiving on : Friday, May 25, 2018 - 8:22:22 AM




  • HAL Id : tel-01695568, version 1


Valérie Robert. Classification croisée pour l’analyse de bases de données de grandes dimensions de pharmacovigilance. Applications [stat.AP]. Université Paris Saclay; Université Paris Sud - Orsay, 2017. Français. ⟨tel-01695568⟩


