Models and methods in genome wide association studies

Luciano Porretta 1, 2
2 INOCS - Integrated Optimization with Complex Structure (ULB - Université Libre de Bruxelles [Bruxelles], Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189)
Abstract : The interdisciplinary field of systems biology has evolved rapidly over the last few years. Different disciplines have contributed to the development of both its experimental and theoretical branches. Although computational biology has been a growing activity in computer science for more than two decades, only in the past few years have optimization models been increasingly developed and analyzed by researchers whose primary background is Operations Research (OR). This dissertation aims to contribute to the field of computational biology by applying mathematical programming to certain problems in molecular biology. Specifically, we address three problems in the domain of Genome-Wide Association Studies: (i) the Pure Parsimony Haplotyping under Uncertain Data Problem, which consists of finding the minimum number of haplotypes necessary to explain a given set of genotypes containing possible reading errors; (ii) the Parsimonious Loss of Heterozygosity Problem, which consists of partitioning suspected polymorphisms from a set of individuals into a minimum number of deletion areas; and (iii) the Multiple Individuals Polymorphic ALU Insertions Recognition Problem, which consists of finding the set of locations in the genome where ALU sequences are inserted in some individual(s). All three problems are NP-hard combinatorial optimization problems. Therefore, we analyse their combinatorial structure and propose an exact solution approach for each of them. The proposed models are efficient, accurate, compact, polynomial-sized and usable in all cases for which the parsimony criterion is well suited for estimation.
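To make the first problem concrete, the following is a minimal brute-force sketch of pure parsimony haplotyping in its basic, error-free form. It is not the mathematical programming model developed in the dissertation, and it ignores reading errors entirely. The genotype encoding (0 and 1 for homozygous sites, 2 for heterozygous sites) and the function names `explains`, `candidate_haplotypes` and `pure_parsimony` are assumptions made for illustration only.

```python
from itertools import combinations, product


def explains(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) resolves genotype g.

    Assumed encoding: g[i] in {0, 1} means both haplotypes carry that
    value at site i; g[i] == 2 means the two haplotypes differ there.
    """
    for a, b, s in zip(h1, h2, g):
        if s == 2:
            if a == b:
                return False
        elif a != s or b != s:
            return False
    return True


def candidate_haplotypes(genotypes):
    """Enumerate every haplotype compatible with at least one genotype."""
    cands = set()
    for g in genotypes:
        # A heterozygous site may take either value; others are fixed.
        options = [(0, 1) if s == 2 else (s,) for s in g]
        cands.update(product(*options))
    return sorted(cands)


def pure_parsimony(genotypes):
    """Smallest haplotype set explaining all genotypes (exhaustive search).

    Exponential in the number of heterozygous sites, so only suitable
    for tiny illustrative instances.
    """
    cands = candidate_haplotypes(genotypes)
    for k in range(1, len(cands) + 1):
        for subset in combinations(cands, k):
            if all(
                any(explains(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return list(subset)
    return []


genotypes = [(0, 2, 1), (2, 2, 1), (0, 0, 1)]
print(pure_parsimony(genotypes))
```

On this small instance, three haplotypes are needed: the third genotype forces (0, 0, 1), the first additionally forces (0, 1, 1), and the second requires one more haplotype. The exhaustive search grows exponentially with the number of heterozygous sites, which is consistent with the NP-hardness noted in the abstract and with the need for dedicated exact models.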
Contributor : Bernard Fortz
Submitted on : Monday, December 10, 2018 - 9:47:05 AM
Last modification on : Friday, May 17, 2019 - 11:40:07 AM
Long-term archiving on : Monday, March 11, 2019 - 12:42:20 PM


Files produced by the author(s)

Published with the consent of the doctoral candidate.


  • HAL Id : tel-01944087, version 1



Luciano Porretta. Models and methods in genome wide association studies. Operations Research [cs.RO]. Université libre de Bruxelles, 2018. English. ⟨tel-01944087⟩


