Handling the heterogeneity of genomic and metabolic networks data within flexible workflows with the PADMet toolbox

A main challenge in the era of fast and massive genome sequencing is to transform sequences into biological knowledge. Reconstructing metabolic networks that include all the biochemical reactions of a cell is one way to understand physiological interactions from genomic data. In 2010, Thiele and Palsson described a general protocol for reconstructing high-quality metabolic networks. Since then, several approaches have been implemented for this purpose. They all rely mainly on drafting a first metabolic network from genome annotations and orthology information, followed by a gap-filling step. In particular, for exotic species, the lack of good annotations and the scarcity of biological information result in incomplete networks. Reference databases of metabolic reactions guide the gap-filling process, which checks whether adding reactions to a network allows compounds of interest to be produced from a given growth medium. Finally, once the network is considered complete enough, functional studies are undertaken, often relying on the constraint-based paradigm derived from the Flux Balance Analysis (FBA) framework (Orth et al., 2010). The high diversity of input files and tools required to run any metabolic network reconstruction protocol is a major drawback. In addition, most approaches require reference metabolic networks of a template organism. Dictionaries mapping the reference metabolic databases to the gene identifiers of the studied organism may also be required. The main issue is that it is very difficult to ensure that the input files are consistent with one another. Such heterogeneity causes loss of information during the use of the protocols and generates uncertainty in the final metabolic model. Here we introduce the PADMet toolbox, which reconciles genomic and metabolic network information.
The toolbox centralizes all this information in a new graph-based format, PADMet (PortAble Database for Metabolism), and provides methods to import, update and export information. As an illustration, the toolbox was used to create a workflow named AuReMe, which aims to produce high-quality genome-scale metabolic networks and eventually input files for most platforms involved in metabolic network analysis. We applied this approach to two exotic organisms, and our results showed the need to combine approaches and reconcile information in order to obtain a functional metabolic network that produces biomass.
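
The gap-filling check described in the abstract (does adding a reference reaction make a compound of interest producible from the growth medium?) can be illustrated with a minimal Python sketch. It uses a purely topological notion of producibility, which is a simplifying assumption and not necessarily the criterion used by the toolbox or by any particular gap-filling tool. All function names, reaction identifiers and compounds below are hypothetical and are not part of the PADMet toolbox.

```python
def producible_compounds(reactions, seeds):
    """Forward-propagate from the growth-medium compounds (seeds).

    `reactions` maps a reaction id to a (substrates, products) pair.
    A reaction fires once all of its substrates are in the scope.
    """
    scope = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if set(substrates) <= scope and not set(products) <= scope:
                scope |= set(products)
                changed = True
    return scope


def candidate_fills(draft, reference, seeds, targets):
    """List reference reactions whose single addition to the draft
    makes every target compound producible from the seeds."""
    fills = []
    for rid, rxn in reference.items():
        if rid in draft:
            continue
        trial = dict(draft)
        trial[rid] = rxn
        if set(targets) <= producible_compounds(trial, seeds):
            fills.append(rid)
    return sorted(fills)


# Toy data (hypothetical): the draft network lacks the g6p -> f6p step.
draft = {
    "R1": (["glucose"], ["g6p"]),
    "R3": (["f6p"], ["pyruvate"]),
}
reference = {
    "R1": (["glucose"], ["g6p"]),
    "R2": (["g6p"], ["f6p"]),
    "R3": (["f6p"], ["pyruvate"]),
    "R4": (["lactate"], ["pyruvate"]),
}
seeds = {"glucose"}
targets = {"pyruvate"}

print(producible_compounds(draft, seeds))
print(candidate_fills(draft, reference, seeds, targets))
```

In this toy case only the reaction converting g6p to f6p restores the production of pyruvate. Real gap-filling usually has to consider sets of reactions rather than single additions, and a constraint-based analysis would additionally take stoichiometry and fluxes into account.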

Keywords

genome-scale metabolic networks; reconstruction workflow; exotic species; data homogenisation

Domains

Bioinformatics [q-bio.QM]; Computer Science [cs]

Main file

soumission_Jobim_106437.pdf (126.06 KB)

Chevallier_jeudi30.pdf (11.66 MB)

Origin: Publisher files allowed on an open archive

Origin: Files produced by the author(s)

Contributor: Marie Chevallier

https://inria.hal.science/hal-01377844

Submitted on: Friday, October 7, 2016, 16:56:37

Last modified on: Wednesday, April 3, 2024, 11:16:05

Long-term archiving on: Friday, February 3, 2017, 23:38:46

Dates and versions

hal-01377844, version 1 (07-10-2016)

Identifiers

HAL Id: hal-01377844, version 1

Cite

Marie Chevallier, Méziane Aite, Jeanne Got, Guillaume Collet, Nicolas Loira, et al. Handling the heterogeneity of genomic and metabolic networks data within flexible workflows with the PADMet toolbox. Jobim 2016: 17ème Journées Ouvertes en Biologie, Informatique et Mathématiques, Jun 2016, Lyon, France. ⟨hal-01377844⟩

Collections

INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA INRIA-CHILE CENTRALESUPELEC IRISA-D7 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM
