Conference papers

Bayesian variable selection for probit mixed models

Abstract : In computational biology, gene expression datasets are characterized by very few individual samples compared to a large number of measurements per sample. Thus, it is appealing to merge these datasets in order to increase the number of observations and diversify the data, allowing a more reliable selection of genes relevant to the biological problem. This necessitates the introduction of the dataset as a random effect. Extending previous work of Lee et al. (2003), a method is proposed to select relevant variables among tens of thousands in a probit mixed regression model, considered as part of a larger hierarchical Bayesian model. Latent variables are used to identify subsets of selected variables, and the collapsing technique of Liu (1994) is combined with a Metropolis-within-Gibbs algorithm (Robert and Casella, 2004). The method is applied to a merged dataset made of three individual gene expression datasets, in which tens of thousands of measurements are available for each of several hundred human breast cancer samples. Even for this large dataset comprising around 20,000 predictors, the method is shown to be efficient and feasible. As a demonstration, it is used to select the most important genes that characterize the estrogen receptor status of the cancer patients.
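
To make the setting of the abstract concrete, the following is a minimal sketch that simulates data from a probit mixed model in which the originating dataset enters as a random effect and only a few predictors are relevant. It is not the author's code and does not implement the selection algorithm. The function name `simulate_probit_mixed`, the default sizes, the random intercept per dataset, and the Gaussian choices for the coefficients and the random effects are all assumptions made for illustration.

```python
import numpy as np


def simulate_probit_mixed(n_per_dataset=(50, 60, 70), p=200, n_relevant=5,
                          sigma_u=1.0, seed=0):
    """Simulate binary responses from a probit model with a dataset random effect.

    All names and default values here are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Dataset membership of each sample: 0, 1 or 2 for three merged datasets.
    dataset = np.repeat(np.arange(len(n_per_dataset)), n_per_dataset)
    n = dataset.size

    # Predictor matrix (e.g. gene expression measurements), here standard normal.
    X = rng.standard_normal((n, p))

    # Binary inclusion vector: gamma[j] is True when predictor j is relevant.
    gamma = np.zeros(p, dtype=bool)
    gamma[rng.choice(p, size=n_relevant, replace=False)] = True

    # Fixed-effect coefficients are nonzero only for the relevant predictors.
    beta = np.zeros(p)
    beta[gamma] = rng.normal(0.0, 2.0, size=n_relevant)

    # One random intercept per dataset (assumed form of the random effect).
    u = rng.normal(0.0, sigma_u, size=len(n_per_dataset))

    # Probit link written with a Gaussian latent variable: y = 1 if latent > 0.
    latent = X @ beta + u[dataset] + rng.standard_normal(n)
    y = (latent > 0).astype(int)
    return X, dataset, gamma, beta, u, y
```

Calling `simulate_probit_mixed()` returns the predictors, the dataset labels, the true inclusion vector, the coefficients, the random effects, and the binary responses. A variable selection method of the kind described in the abstract would aim to recover `gamma` from `X`, `dataset`, and `y` alone.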

Contributor : Conférence Sfds-Hal
Submitted on : Thursday, June 24, 2010 - 8:58:30 AM
Last modification on : Tuesday, March 30, 2021 - 3:15:36 AM
Long-term archiving on : Monday, October 22, 2012 - 2:46:50 PM


Files produced by the author(s)


  • HAL Id : inria-00494781, version 1



Meïli Baragatti. Bayesian variable selection for probit mixed models. 42èmes Journées de Statistique, 2010, Marseille, France. ⟨inria-00494781⟩


