2014 AIChE Annual Meeting
(508c) On Inverse Problems in Complex Biological Systems: An Integrated Multiple Shooting and Genetic Algorithm Approach
Authors
The inverse problem of parameter estimation and state identification from dynamical data obtained from systems exhibiting nonlinear behavior, using ODE model descriptions is crucial for studying the behavior of complex biological systems. The inverse problem is generally posed as optimization problem so as to fit a suitable model description by using monitored data from the system. Several deterministic, heuristic and metaheuristic methods have been used for these studies (Chou and Voit, 2009; Engl et. al., 2009). Employing deterministic approaches for parameter estimation in nonlinear systems often face the problem of getting stuck in local minima while heuristic/metaheuristic approaches become computationally expensive due to the high number of iterative simulations of the ODE model.
To alleviate some of these problems, hybrid optimization approaches have been proposed (El-Milhoub et. al., 2006; Rodriguez-Fernandez et. al., 2006). In the majority of these algorithms, heuristic global search approaches, such as genetic algorithms, simulated annealing etc. are used along with deterministic gradient based local search algorithms. In addition, several support methods have also been developed that narrow down the search space and ease the computational and mathematical rigor for the original optimization problem (Voit and Almeida, 2004; Kimura et. al., 2005; Jia et. al., 2011; Vilela et. al., 2009). Even so, these hybrid and support methods have limited applicability and there is still a need for the development of algorithms that address the common issues faced in finding solution of inverse problems. These pertain to presence of measurement noise, missing data points and when some process variables are not amenable to measurement.
Recently, we proposed a novel parameter estimation and state identification algorithm, EMSGA, by Embedding (E) a multiple shooting (MS) methodology in the framework of a genetic algorithm (GA) (Sarode et. al., 2014). In the MS component, the total sampling interval of the monitored data is divided into multiple sub-intervals for numerical solution and the data fitting objective is minimized such that consecutive sub-intervals have continuity (Peifer and Timmer, 2007, Ghosh et. al., 2001). This optimization problem can be solved using gradient search methods such as nonlinear constrained Sequential Quadratic Programming (SQP). On convergence it provides the state identification of process variables along with true process parameter values by estimation. Its application offers advantages in handling noisy data, missing data points and recovering unmonitored process variables. However, the application of MS as a stand-alone formalism for complex optimization problems with high nonlinearity, stiffness, large number of parameters, and noisy data is inefficient because of the weak global search capabilities. Thus, in our ESMGA approach, the MS methodology is now embedded in a framework of a heuristic global search method, viz., GA and combining the two alleviates the difficulties. In contrast to standard stand alone GA, where based on the objective score of estimated parameter values, a new generation population is evolved directly, in EMSGA, the MS algorithm is incorporated as a transformer function to obtain even better estimates of the guessed parameters and process variables values before evolving the estimated population by GA. Thus, the local search of MS and global search of GA work in cooperation to estimate process parameter values, noise free process state variables values while simultaneously recovering the unmonitored dynamics of the other dependent variables. To assist in the computational performance of the algorithm, we show how additional transformer functions in the form of different support methods can further augment the overall EMSGA framework in a modular fashion.
We demonstrate the utility of the proposed algorithm in modeling noisy data obtained from batch fermentation for ethanol and glycerol production using Saccharomyces diastaticus LORRE 316 (Wang et. al., 2001). Ability of estimating large number of parameters along with noise-free dynamics of monitored and unmonitored system variables is demonstrated using noisy oscillatory and even chaotic dynamics obtained from Lotka-Volterra models (Voit and Chou, 2010). Canonical models for fermentation pathway in Saccharomyces cerevisiae (Curto et. al., 1995) are used to demonstrate utility of various support methods in the performance enhancement of the algorithm.
Chou, I., & Voit, E. O. (2009). Recent developments in parameter estimation and structure identification of biochemical and genomic systems. Mathematical Biosciences, 219(2), 57-83.
Curto, R., Sorribas, A., & Cascante, M. (1995). Comparative characterization of the fermentation pathway of Saccharomyces cerevisiae using biochemical systems theory and metabolic control analysis: Model definition and nomenclature. Mathematical Biosciences, 130(1), 25-50.
El-Milhoub, T. A., Hopgood, A. A., Nolle, L., & Battersby, A. (2006). Hybrid Genetic Algorithms: A Review. Engineering Letters, 13(3), 124-137.
Engl, H. W., Flamm, C., Kügler, P., Lu, J., Müller, S., & Schuster, P. (2009). Inverse problems in systems biology. Inverse Problems, 25(12), 123014.
Ghosh, A., Ravi Kumar, V., & Kulkarni, B. D. (2001). Parameter estimation in spatially extended systems: The Karhunen-Lóeve and Galerkin multiple shooting approach. Physical Review E, 64(5), 056222_1-056222_8.
Irvine, D. H., & Savageau, M. A. (1990). Efficient solution of nonlinear ordinary differential equations expressed in S-system canonical form. SIAM Journal on Numerical Analysis, 27(3), 704-735.
Jia, G., Stephanopoulos, G. N., & Gunawan, R. (2011). Parameter estimation of kinetic models from metabolic profiles: two-phase dynamic decoupling method. Bioinformatics, 27(14), 1964-1970.
Kimura, S., Ide, K., Kashihara, A., Kano, M., Hatakeyama, M., Masui, R., Nakagawa, N., Yokoyama, S., Kuramitsu, S., & Konagaya, A. (2005). Inference of S-system models of genetic networks using a cooperative coevolutionary algorithm. Bioinformatics, 21, 1154-63.
Peifer, M., & Timmer, J. (2007). Parameter estimation in ordinary differential equations for biochemical processes using the method of multiple shooting. Systems Biology, IET, 1(2), 78-88.
Rodriguez-Fernandez, M., Egea, J. A., & Banga, J. R. (2006). Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinformatics, 7(1), 483.
Sarode, K. D., Ravi Kumar, V., Kulkarni, B. D. (2014) Embedded Multiple Shooting Methodology in a Genetic Algorithm Framework for Parameter Estimation and State Identification of Complex Systems. Industrial & Engineering Chemistry Research Submitted.
Savageau, M. A., & Voit, E. O. (1987). Recasting nonlinear differential equations as S-systems: a canonical nonlinear form. Mathematical Biosciences, 87(1), 83-115.
Vilela, M., Vinga, S., Maia, M. A., Voit, E. O., & Almeida, J. S. (2009). Identification of neutral biochemical network models from time series data. BMC Systems Biology, 3(1), 47.
Voit, E. O. (2013). Biochemical systems theory: a review. ISRN Biomathematics, 2013.
Voit, E. O., & Almeida, J. (2004). Decoupling dynamical systems for pathway identification from metabolic profiles. Bioinformatics, 20(11), 1670-1681.
Voit, E., & Chou, I. C. (2010). Parameter estimation in canonical biological systems models. International Journal of Systems and Synthetic Biology, 1, 1-19.
Wang, F. S., Su, T. L., & Jang, H. J. (2001). Hybrid differential evolution for problems of kinetic parameter estimation and dynamic optimization of an ethanol fermentation process. Industrial & Engineering Chemistry Research, 40(13), 2876-2885.