2018 AIChE Annual Meeting
(728e) Comparison of Surrogate Modeling Techniques for Surrogate-Based Optimization
Construction of a surrogate model comprises three steps: (1) selection of the sample points, (2) optimization or "training" of the model parameters, and (3) evaluation of the accuracy of the surrogate model (Wang et al., 2014). Although several machine learning and regression techniques have been developed for surrogate model construction, little work has been done on how best to select the appropriate model for a particular application, for both surrogate modeling and surrogate-based optimization. Most studies that compare surrogate model performance consider only a few models on a limited number of functions or complex models. Davis et al. (2017) investigated a more extensive selection of surrogate modeling techniques for 35 challenge functions and concluded that Artificial Neural Networks, Automated Learning of Algebraic Models using Optimization, and Extreme Learning Machines yielded the most accurate predictions for the challenge functions tested. This work builds upon that study and further addresses the knowledge gap by comparing the ability of eight surrogate modeling techniques both to learn and accurately model the responses of a set of challenge functions and to locate the extrema of these functions using surrogate-based optimization. The surrogate modeling techniques considered are Artificial Neural Networks (ANN), Automated Learning of Algebraic Models using Optimization (ALAMO), Radial Basis Networks (RBN), Extreme Learning Machines (ELM), Gaussian Process Regression (GPR), Random Forests (RF), Support Vector Regression (SVR), and Multivariate Adaptive Regression Splines (MARS). These techniques are used to construct surrogate models for the 47 optimization challenge functions from the Virtual Library of Simulation Experiments (Surjanovic and Bingham, 2013).
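The three construction steps can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a two-dimensional sphere function as the challenge function, Latin Hypercube sampling for step (1), and a Gaussian-kernel weighted average (Nadaraya-Watson regression) as a deliberately simple stand-in for the eight techniques compared in this work. The names `latin_hypercube`, `kernel_predict`, and `loo_rmse`, the candidate bandwidths, and the sample sizes are all hypothetical choices made for illustration.

```python
import math
import random

def latin_hypercube(n, bounds, rng):
    """Return n points; every dimension is cut into n equal strata with one point each."""
    columns = []
    for lo, hi in bounds:
        # One uniform draw inside each stratum, then shuffle the stratum order.
        col = [lo + (hi - lo) * (k + rng.random()) / n for k in range(n)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n)]

def sphere(x):
    """Challenge function used here as an example: sum of squared inputs."""
    return sum(v * v for v in x)

def kernel_predict(x, X, y, h, skip=None):
    """Gaussian-kernel weighted average of the training responses (stand-in surrogate)."""
    num = den = 0.0
    for i, (xi, yi) in enumerate(zip(X, y)):
        if i == skip:  # leave this training point out (used for cross-validation)
            continue
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        w = math.exp(-d2 / (2.0 * h * h))
        num += w * yi
        den += w
    # Fall back to the mean response if every weight underflows to zero.
    return num / den if den > 0.0 else sum(y) / len(y)

def loo_rmse(X, y, h):
    """Leave-one-out RMSE of the kernel surrogate for bandwidth h."""
    sq = [(kernel_predict(X[i], X, y, h, skip=i) - y[i]) ** 2 for i in range(len(X))]
    return math.sqrt(sum(sq) / len(sq))

rng = random.Random(0)
bounds = [(-5.12, 5.12), (-5.12, 5.12)]

# Step 1: select the sample points.
X_train = latin_hypercube(60, bounds, rng)
y_train = [sphere(x) for x in X_train]

# Step 2: "train" the model, here by picking the bandwidth with the lowest
# leave-one-out error from a small assumed candidate set.
candidates = [0.25, 0.5, 1.0, 2.0, 4.0]
h_best = min(candidates, key=lambda h: loo_rmse(X_train, y_train, h))

# Step 3: evaluate accuracy on an independent test sample.
X_test = latin_hypercube(200, bounds, rng)
y_test = [sphere(x) for x in X_test]
y_pred = [kernel_predict(x, X_train, y_train, h_best) for x in X_test]
rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_test)) / len(y_test))

# Reference point: error of always predicting the mean training response.
mean_y = sum(y_train) / len(y_train)
baseline_rmse = math.sqrt(sum((mean_y - t) ** 2 for t in y_test) / len(y_test))

print(f"selected bandwidth: {h_best}")
print(f"surrogate RMSE: {rmse:.3f}  mean-predictor RMSE: {baseline_rmse:.3f}")
```

Swapping `kernel_predict` for any of the techniques listed above changes only step (2); the sampling and evaluation steps stay the same, which is what makes a like-for-like comparison across techniques possible.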
The effects of the challenge function characteristics, including function shape and number of inputs, and of the sampling method on surrogate model performance are evaluated. The sampling methods studied are Sobol sequence sampling and Latin Hypercube sampling (LHS). Four performance measures are used to evaluate the accuracy of the surrogate models: root mean squared error (RMSE), maximum percent error (MPE), the R-squared value, and the Akaike Information Criterion (AIC). The models' ability to locate the extrema of the functions is evaluated by calculating the distance between the extreme point(s) estimated by the model and the actual function extrema. The results provide guidance on which surrogate modeling technique to use based on the specifics and characteristics of the function or data set being modeled.
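The performance measures can be written out as a short Python sketch. The abstract does not state the exact formulas, so the definitions below are assumptions: MPE is taken as the largest absolute error relative to the true response, AIC uses the common least-squares form n ln(SSE/n) + 2k for a model with k parameters, and the extremum measure is the Euclidean distance from the estimated point to the nearest true extremum. The Branin function, with its three known global minima, serves as an example challenge function; the sample responses and the estimated minimizer are made-up illustrative values.

```python
import math

def sse(y_true, y_pred):
    """Sum of squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sse(y_true, y_pred) / len(y_true))

def max_percent_error(y_true, y_pred):
    """Largest absolute error relative to the true value, in percent.
    Assumes no true response is exactly zero."""
    return 100.0 * max(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SSE/SST."""
    mean = sum(y_true) / len(y_true)
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse(y_true, y_pred) / sst

def aic(y_true, y_pred, n_params):
    """Least-squares AIC (assumed form): n * ln(SSE / n) + 2 * k."""
    n = len(y_true)
    return n * math.log(sse(y_true, y_pred) / n) + 2 * n_params

def distance_to_nearest_extremum(x_est, true_extrema):
    """Euclidean distance from an estimated extremum to the closest true one."""
    return min(
        math.sqrt(sum((a - b) ** 2 for a, b in zip(x_est, x_star)))
        for x_star in true_extrema
    )

def branin(x1, x2):
    """Branin test function on x1 in [-5, 10], x2 in [0, 15]."""
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    t = 1.0 / (8.0 * math.pi)
    return (x2 - b * x1 ** 2 + c * x1 - 6.0) ** 2 + 10.0 * (1.0 - t) * math.cos(x1) + 10.0

# The three global minimizers of the Branin function.
BRANIN_MINIMA = [(-math.pi, 12.275), (math.pi, 2.275), (9.42478, 2.475)]

# Illustrative (made-up) true and predicted responses.
y_true = [1.0, 2.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 4.2, 4.8]

print("RMSE:", rmse(y_true, y_pred))
print("MPE (%):", max_percent_error(y_true, y_pred))
print("R-squared:", r_squared(y_true, y_pred))
print("AIC (k=2):", aic(y_true, y_pred, n_params=2))

# Hypothetical minimizer returned by a surrogate-based optimization run.
x_est = (3.0, 2.0)
print("distance to nearest minimum:", distance_to_nearest_extremum(x_est, BRANIN_MINIMA))
```

Taking the distance to the nearest true extremum keeps the measure meaningful for multimodal functions such as Branin, where a surrogate that finds any one of the global minima has done its job.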
References:
Davis, S., Cremaschi, S., Eden, M., 2017, "Efficient Surrogate Model Development: Optimum Model Form Based on Input Function Characteristics", Computer Aided Chemical Engineering 70.1, 457-462.
Grimstad, B., Foss, B., Heddle, R., Woodman, M., 2016, "Global optimization of multiphase flow networks using spline surrogate models", Computers and Chemical Engineering 84.1, 237-254.
Han, ZH. and Zhang, KH., 2012, "Surrogate-Based Optimization", Real-World Applications of Genetic Algorithms, InTech, 343-362.
Icten, E., Nagy, Z., Reklaitis, G., 2015, "Process control of a dropwise additive manufacturing system for pharmaceuticals using polynomial chaos expansion based surrogate model", Computers and Chemical Engineering 83.1, 221-231.
Surjanovic, S., Bingham, D., 2013, "Virtual Library of Simulation Experiments: Test Functions and Datasets", http://www.sfu.ca/~ssurjano.
Wang, C., Duan, Q., Gong, W., Ye, A., Di, Z., Miao, C., 2014, "An evaluation of adaptive surrogate modeling based optimization with two benchmark problems", Environmental Modelling and Software 60.1, 167-179.