Theses

Classification croisée pour l’analyse de bases de données de grandes dimensions de pharmacovigilance [Coclustering for the analysis of large pharmacovigilance databases]

Abstract : This thesis gathers methodological contributions to the statistical analysis of large pharmacovigilance datasets. These datasets yield large, sparse matrices, and these two characteristics are the main statistical challenges in modelling them. The first part of the thesis is dedicated to coclustering the pharmacovigilance contingency table with the normalized Poisson latent block model. The objective is, on the one hand, to provide pharmacologists with a reduced set of interesting areas to explore in more detail. On the other hand, this coclustering provides useful background information for dealing with the individual database. Within this framework, a parameter estimation procedure for the model is detailed and objective model selection criteria are developed to choose the best-fitting model. Because the datasets are so large, we propose a procedure that explores the coclustering model space in a non-exhaustive but relevant way. Additionally, to assess the performance of the methods, a convenient coclustering index is developed to compare partitions with high numbers of clusters. These statistical tools are not specific to pharmacovigilance and can be used for any coclustering problem. The second part of the thesis is devoted to the statistical analysis of the large individual data, which are more numerous but also provide even more valuable information. The aim is to produce clusters of individuals according to their drug profiles, together with subgroups of drugs and adverse effects with possible links; this overcomes the coprescription and masking phenomena, which are common contingency table issues in pharmacovigilance. Moreover, the interaction between several adverse effects is taken into account. For this purpose, we propose a new model, the Multiple Latent Block Model (MLBM), which makes it possible to cocluster two binary tables by imposing the same row ranking.
Assumptions inherent to the model are discussed and sufficient identifiability conditions for the model are presented. Then a parameter estimation algorithm is studied and objective model selection criteria are developed. Moreover, a numerical simulation model of the individual data is proposed to compare existing methods and study their limits. Finally, the proposed methodology for dealing with individual pharmacovigilance data is presented and applied to a sample of the French pharmacovigilance database between 2002 and 2010.
Document type : Theses

Cited literature [50 references]

https://hal.inria.fr/tel-01695568
Contributor : Valérie Robert
Submitted on : Monday, January 29, 2018 - 2:53:26 PM
Last modification on : Friday, April 30, 2021 - 9:55:59 AM
Long-term archiving on : Friday, May 25, 2018 - 8:22:22 AM

File

Dissertation.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01695568, version 1

Citation

Valérie Robert. Classification croisée pour l’analyse de bases de données de grandes dimensions de pharmacovigilance. Applications [stat.AP]. Université Paris Saclay; Université Paris Sud - Orsay, 2017. French. ⟨tel-01695568⟩
