Title: On Inferring Reactions from Data Time Series by a Statistical Learning Greedy Heuristics
URL: https://hal.inria.fr/hal-02173721
Authors:
Julien Martinelli (Modèles de Cellules Souches Malignes et Thérapeutiques, UP11 - Université Paris-Sud - Paris 11, INSERM - Institut National de la Santé et de la Recherche Médicale; Lifeware - Computational systems biology and optimization, Inria Saclay - Ile de France, Inria - Institut National de Recherche en Informatique et en Automatique)
Jeremy Grignard (Lifeware - Computational systems biology and optimization, Inria Saclay - Ile de France, Inria; IRS - Institut de Recherches SERVIER)
Sylvain Soliman (Lifeware - Computational systems biology and optimization, Inria Saclay - Ile de France, Inria)
François Fages (Lifeware - Computational systems biology and optimization, Inria Saclay - Ile de France, Inria)
Publisher: HAL CCSD
Year: 2019
Subjects: [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]
Contributors (as listed in the record): François Fages; Luca Bortolussi; Guido Sanguinetti
Record dates: 2019-07-04 16:21:27; 2023-03-15 08:56:17; 2019-07-04 16:23:36
Language: en
Type: Conference papers
Format: application/pdf
Version: 1

Abstract:
With the automation of biological experiments and the increasing quality of single-cell data that can now be obtained by phospho-proteomics and time-lapse videomicroscopy, automating the building of mechanistic models from these time series becomes both conceivable and necessary for many new applications. While learning numerical parameters to fit a given model structure to observed data is now well understood, learning the structure of the model itself is a more challenging problem, which previous attempts failed to solve without relying heavily on prior knowledge about that structure.
In this paper, we consider mechanistic models based on chemical reaction networks (CRNs) with continuous dynamics given by ordinary differential equations. The input data are finite time series of molecular species concentrations over a given time horizon, obtained from a finite set of perturbed initial conditions. We present a greedy heuristic, in the form of an unsupervised statistical learning algorithm, that infers reactions with a time complexity of O(t·n²) per inferred reaction, where n is the number of species and t is the number of observed transitions in the traces. We evaluate this algorithm both on simulated data from hidden CRNs and on real single-cell videomicroscopy data on the circadian clock and cell cycle progression of NIH3T3 embryonic fibroblasts. In all cases, our algorithm is able to infer meaningful reactions, though generally not a complete set, for instance in the presence of multiple time scales or highly variable traces.
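
The abstract does not describe the algorithm itself, so the following Python sketch is not the authors' method. It is only a toy illustration, under assumptions of our own, of why scanning every ordered pair of species over every observed transition costs on the order of t·n² operations. The function names `simulate_conversion` and `rank_candidate_pairs`, the threshold `eps`, and the scoring rule (count the transitions in which one species decreases while another increases) are all hypothetical choices made for the example.

```python
from itertools import permutations


def simulate_conversion(steps=20, k=0.5, dt=0.1):
    """Toy trace (assumption): Euler simulation of A -> B with an inert species C."""
    a, b, c = 1.0, 0.0, 0.3
    trace = [(a, b, c)]
    for _ in range(steps):
        flux = k * a * dt          # mass-action rate of A -> B over one step
        a, b = a - flux, b + flux
        trace.append((a, b, c))
    return trace


def rank_candidate_pairs(traces, eps=1e-9):
    """Score each ordered pair (i, j) by the number of transitions in which
    species i decreases while species j increases.

    The two nested loops visit t transitions and n*(n-1) ordered pairs,
    which gives the O(t * n^2) cost structure.
    """
    n = len(traces[0][0])
    support = {pair: 0 for pair in permutations(range(n), 2)}
    t = 0
    for trace in traces:
        for prev, curr in zip(trace, trace[1:]):
            t += 1
            delta = [c - p for p, c in zip(prev, curr)]
            for i, j in support:
                if delta[i] < -eps and delta[j] > eps:
                    support[(i, j)] += 1
    # Highest support first; this is a crude score, not a statistical test.
    ranked = sorted(support.items(), key=lambda kv: kv[1], reverse=True)
    return ranked, t


if __name__ == "__main__":
    ranked, t = rank_candidate_pairs([simulate_conversion()])
    print(f"{t} transitions scanned; best candidate pair: {ranked[0]}")
```

On this toy trace, the only pair with non-zero support is (0, 1), that is, species A as candidate reactant and B as candidate product. Real traces are noisy and involve several time scales, which is where a simple count like this one would stop being informative.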