Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm

Marta Avalos (1, 2), Hélène Pouyes (1), Yves Grandvalet (3), Ludivine Orriols (4, 1), Emmanuel Lagarde (5)
1 INSERM, ISPED, Centre INSERM U897-Epidemiologie-Biostatistique
Epidémiologie et Biostatistique [Bordeaux]
2 SISTM - Statistics In System biology and Translational Medicine
Epidémiologie et Biostatistique [Bordeaux], Inria Bordeaux - Sud-Ouest
4 Prévention et prise en charge des traumatismes [Bordeaux]
Université Bordeaux Segalen - Bordeaux 2, Inria - Institut National de Recherche en Informatique et en Automatique, INSERM - Institut National de la Santé et de la Recherche Médicale : U897
Abstract: This paper considers estimation and variable selection for large, high-dimensional data (a high number of predictors p and a large sample size N, without excluding the possibility that N < p) arising from an individually matched case-control study. We develop a simple algorithm that adapts the Lasso and related methods to the conditional logistic regression model. Our proposal relies on simplifying the calculations involved in the likelihood function. The proposed algorithm then iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. This method can handle large problems and deals efficiently with sparse features. We discuss its benefits and drawbacks relative to existing implementations. We also illustrate the use of these techniques on a pharmacoepidemiological study of medication use and traffic safety.
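The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it assumes the simplest design, 1:1 matching (one case and one control per stratum). In that special case the conditional likelihood reduces to a logistic likelihood without intercept on the within-pair differences, with every response equal to 1. The function names (`sparse_clogit_pairs`, `lambda_max`, `soft_threshold`), the 1/n scaling of the loss, and the tolerances are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Scalar soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)

def lambda_max(x_case, x_control):
    """Smallest penalty for which all coefficients are zero (1/n-scaled loss)."""
    d = x_case - x_control
    return np.max(np.abs(d.mean(axis=0))) / 2.0

def sparse_clogit_pairs(x_case, x_control, lambdas,
                        n_outer=50, n_inner=200, tol=1e-6):
    """L1-penalized conditional logistic regression, assuming 1:1 matching.

    Minimizes (1/n) * sum_i log(1 + exp(-d_i @ beta)) + lam * ||beta||_1,
    where d_i is the case-minus-control difference in pair i.
    Returns one coefficient vector per penalty, ordered by decreasing penalty.
    """
    d = np.asarray(x_case, dtype=float) - np.asarray(x_control, dtype=float)
    n, p = d.shape
    beta = np.zeros(p)
    path = []
    for lam in sorted(lambdas, reverse=True):      # warm starts along the path
        for _ in range(n_outer):                   # reweighting (outer) loop
            eta = d @ beta
            prob = 1.0 / (1.0 + np.exp(-eta))
            w = np.clip(prob * (1.0 - prob), 1e-5, None)
            z = eta + (1.0 - prob) / w             # working response (y = 1)
            denom = (w[:, None] * d ** 2).sum(axis=0) / n
            r = z - eta                            # residual of the weighted problem
            beta_old = beta.copy()
            for _ in range(n_inner):               # cyclical coordinate descent
                max_change = 0.0
                for j in range(p):
                    if denom[j] == 0.0:
                        continue
                    rho = np.sum(w * d[:, j] * r) / n + denom[j] * beta[j]
                    new = soft_threshold(rho, lam) / denom[j]
                    delta = new - beta[j]
                    if delta != 0.0:
                        r -= delta * d[:, j]
                        beta[j] = new
                        max_change = max(max_change, abs(delta))
                if max_change < tol:
                    break
            if np.max(np.abs(beta - beta_old)) < tol:
                break
        path.append(beta.copy())
    return np.array(path)
```

Calling `sparse_clogit_pairs(x_case, x_control, lambdas)` with two arrays of shape (number of pairs, p) returns one row of coefficients per penalty value. Starting the grid just above `lambda_max(x_case, x_control)` gives an all-zero first solution, and each later fit is warm-started from the previous one. Designs with several controls per case need the general conditional likelihood, which this sketch does not cover.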

https://hal.inria.fr/hal-01217312
Contributor: Anne Jaigu
Submitted on: Monday, October 19, 2015 - 13:02:30
Last modified on: Thursday, January 11, 2018 - 06:26:37
Document(s) archived on: Thursday, April 27, 2017 - 06:42:53

File

1471-2105-16-S6-S1-1.pdf
Publication funded by an institution

Citation

Marta Avalos, Hélène Pouyes, Yves Grandvalet, Ludivine Orriols, Emmanuel Lagarde. Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm. BMC Bioinformatics, BioMed Central, 2015, 16 (Suppl 6), pp.S1. 〈http://www.biomedcentral.com/1471-2105/16/S6/S1〉. 〈10.1186/1471-2105-16-S6-S1〉. 〈hal-01217312〉
