Metabolic Engineering X
A Bayesian Design of Experiments for Ensemble Modelling of Metabolic Networks
Model-based discovery in biology is an iterative process that integrates wet-lab experiments, in silico analysis, and optimisation. Most modelling studies in the literature involve the creation of an accurate (high-fidelity) model of the system, defined by a set of equations and parameter values that describe important or interesting behaviour of the system. Iterative model-based discovery poses many challenges. The bottleneck is typically the estimation of unknown kinetic parameters from experimental data, which has led to the development of a large number of parameter estimation techniques [1].
The estimation of kinetic parameters by fitting model simulations to biological data is usually ill-posed. Often no single (best-fit) solution to the data fitting problem exists; instead, one can find many parameter combinations, i.e. an ensemble of parameters, that fit the data statistically equally well. Here, the parameter ensemble represents the uncertainty of the model parameters. We have recently introduced an algorithm for constructing such an ensemble from a given dataset of time-series metabolite concentrations for kinetic models of metabolic networks [2]. The ensemble corresponds to the parameter subspace bounded by the contour of the likelihood ratio at a specified statistical significance level. In practical applications, it is often desirable, and sometimes necessary, to reduce the size of the parameter ensemble by performing additional experiments and gathering new data.
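As an illustration of the likelihood-ratio idea, the following minimal sketch keeps every candidate parameter vector whose fit is statistically indistinguishable from the best fit. It is not the algorithm of [2]. It rests on assumptions that are not part of this abstract: a toy one-metabolite decay model with two parameters, independent Gaussian noise with known standard deviation `sigma`, a precomputed list of candidate parameter vectors, and the best of those candidates standing in for the true best fit. All function names are hypothetical.

```python
import math
import random


def simulate(params, times):
    """Toy model x(t) = x0 * exp(-k * t), a stand-in for a kinetic network model."""
    x0, k = params
    return [x0 * math.exp(-k * t) for t in times]


def sum_squared_residuals(params, times, data):
    """Sum of squared differences between the data and the model simulation."""
    return sum((y - m) ** 2 for y, m in zip(data, simulate(params, times)))


def likelihood_ratio_ensemble(candidates, times, data, sigma, alpha=0.05):
    """Return (ensemble, best) for a two-parameter model.

    Under Gaussian noise with known sigma, -2 log(likelihood ratio) equals
    (SSR(p) - SSR(best)) / sigma**2. A candidate is kept when this statistic
    is below the chi-square quantile at level 1 - alpha. For 2 degrees of
    freedom that quantile has the closed form -2 * ln(alpha).
    """
    threshold = -2.0 * math.log(alpha)
    # Assumption: the best candidate approximates the maximum-likelihood fit.
    best = min(candidates, key=lambda p: sum_squared_residuals(p, times, data))
    ssr_best = sum_squared_residuals(best, times, data)
    ensemble = [
        p for p in candidates
        if (sum_squared_residuals(p, times, data) - ssr_best) / sigma ** 2 <= threshold
    ]
    return ensemble, best
```

The closed-form threshold holds only for two parameters; a model with more parameters would need the chi-square quantile for the corresponding degrees of freedom.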
The goal of the present work is to design experiments that lead to a significant reduction in the ensemble size. Model-based experimental design aims at obtaining the most informative data from an experiment in order to validate the predictions of a model (model outputs). For this purpose, the experimental conditions, for example the sampling times and the time-varying controls or inputs of the system, are optimized to obtain the maximum information from the data. Many experimental designs have previously been developed using optimality criteria based on the Fisher information matrix, without considering the uncertainty associated with the model parameters. Here, we have used a Bayesian approach to optimize the experimental conditions using the parameter ensemble as a priori information.
The design of experiments is based on approximate Bayesian computation design (ABCD) [3]. Briefly, the procedure consists of six steps: (1) select an experimental design d, (2) draw a sample p from the parameter ensemble (prior distribution) P(p), (3) for every parameter sample, simulate the model and generate a sample of noisy data y* (assuming the data noise statistics are known), thus producing a sample from the joint distribution P(y,p|d), (4) for every y in the marginal distribution P(y|d), gather the samples p for which y* is within a neighbourhood of y, (5) compute the design criterion from the gathered samples p, and (6) repeat these steps while optimizing the design criterion. We evaluated several design criteria, including maximization of the mode of the posterior distribution and Fisher information. We demonstrated the utility of the design on case studies of metabolic networks: a generic branched pathway [4] and the trehalose pathway in Saccharomyces cerevisiae [5].
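The six steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the model is a toy single-parameter decay, the design d is a list of sampling times, the neighbourhood is a Euclidean ball of radius `eps`, the design criterion is the negative average posterior variance of the parameter (a simple substitute for the criteria named above), and the optimization is a plain enumeration over a few candidate designs. All names are hypothetical.

```python
import math
import random
import statistics


def simulate(k, times):
    """Toy model x(t) = exp(-k * t); a stand-in for a kinetic network model."""
    return [math.exp(-k * t) for t in times]


def abc_design_utility(design, ensemble, sigma, eps, n_draws, rng):
    """Estimate a design criterion for one design (steps 2 to 5)."""
    # Steps 2-3: sample the joint distribution P(y, p | d).
    draws = []
    for _ in range(n_draws):
        k = rng.choice(ensemble)  # draw p from the ensemble (prior)
        y = [m + rng.gauss(0.0, sigma) for m in simulate(k, design)]
        draws.append((k, y))
    # Step 4: for every simulated y, gather the p whose data lie within eps.
    variances = []
    for _, y_ref in draws:
        neighbours = [k for k, y in draws if math.dist(y, y_ref) <= eps]
        if len(neighbours) >= 2:
            variances.append(statistics.pvariance(neighbours))
    if not variances:
        return float("-inf")  # neighbourhoods too sparse to score this design
    # Step 5: assumed criterion, negative average posterior variance.
    return -statistics.fmean(variances)


def select_design(candidate_designs, ensemble, sigma, eps, n_draws, rng):
    """Steps 1 and 6: score each candidate design and keep the best one."""
    return max(
        candidate_designs,
        key=lambda d: abc_design_utility(d, ensemble, sigma, eps, n_draws, rng),
    )
```

In this toy setting, a measurement at t = 0 carries no information about k because every ensemble member predicts the same value there, so a later sampling time should score higher. The neighbour search is quadratic in `n_draws`, which is acceptable for a sketch but would need a more efficient scheme for realistic models.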
References
[1] I. C. Chou and E. O. Voit. Recent developments in parameter estimation and structure identification of biochemical and genomic systems. Math Biosci. 219(2):57-83, 2009.
[2] G. Jia, G. Stephanopoulos, and R. Gunawan. Ensemble kinetic modeling of metabolic networks from dynamic metabolic profiles. Metabolites 2(4):891-912, 2012.
[3] M. Hainy, W. Müller, and H. Wynn. Approximate Bayesian Computation Design (ABCD), an introduction. In: D. Ucinski, A. C. Atkinson, and M. Patan (Eds.), mODa 10 – Advances in Model-Oriented Design and Analysis, Proceedings of the 10th International Workshop in Model-Oriented Design and Analysis Held in Łagów Lubuski, Poland, June 10–14, 2013, Contributions to Statistics, Springer International Publishing, pp. 135-143, 2013.
[4] E. O. Voit and J. Almeida. Decoupling dynamical systems for pathway identification from metabolic profiles. Bioinformatics 20(11):1670-1681, 2004.
[5] I. C. Chou and E. O. Voit. Estimation of dynamic flux profiles from metabolic time series data. BMC Syst. Biol. 6:84-106, 2012.