2021 Annual Meeting
(346o) Bayesian Machine Learning of Dynamic Systems with Unknown Noise Characteristics
Wilson and Sahinidis [1] developed a regression and classification model learning methodology that builds surrogate steady-state models from a given dataset using a minimal set of sample points. While Ferguson et al. [4] investigated a class of information criteria that penalize model fit based on the degree of dependency among model parameters, the proposed criterion was verified only on steady-state simulation experiments, mostly with single measurements and without noise or uncertainty in the data. Industrial measurements, however, are often corrupted by noise with unknown characteristics. Furthermore, the user may hold prior beliefs about the model parameters that they would like to incorporate as a prior distribution. The focus of this work is on developing a Bayesian machine learning technique for dynamic systems when the measurement noise characteristics are unknown and a priori user knowledge about the model parameters is available.
Inference based on Bayes' rule is a widely used method for system identification and parameter estimation, especially when the measurements and model parameters are uncertain. The method combines a prior probability distribution with the likelihood to obtain the maximum a posteriori (MAP) estimate of the parameters [3]. This method has been used effectively within the expectation maximization (EM) algorithm framework for structural analysis of plants, where it estimates connectivity strengths among various sub-processes [5]. The focus of that work was on identifying input-state-output dynamic bilinear models for process systems. The present work extends our previous work [5] to more general classes of nonlinear systems by considering a large family of nonlinear basis functions and by using measurement data that are correlated and noisy with unknown characteristics. The optimal model is selected using an information criterion that penalizes overfitting.
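The interplay between MAP estimation and EM described above can be illustrated with a minimal sketch. It is not the algorithm of this work: it assumes a static model that is linear in its parameters, y = Phi * theta + e, with independent Gaussian noise of unknown variance sigma2 and a Gaussian prior theta ~ N(mu0, S0). The function name em_map_linear, the polynomial basis, and all numerical values are hypothetical choices made for illustration only. The parameters are treated as latent variables: the E-step computes their Gaussian posterior, whose mean is also the MAP estimate, and the M-step updates the noise variance.

```python
import numpy as np


def em_map_linear(Phi, y, mu0, S0, sigma2_init=1.0, n_iter=200, tol=1e-10):
    """EM sketch for y = Phi @ theta + e, e ~ N(0, sigma2 * I), theta ~ N(mu0, S0).

    Returns the posterior mean of theta (equal to the MAP estimate for this
    Gaussian model) and the estimated noise variance. Illustrative only.
    """
    n_samples = Phi.shape[0]
    S0_inv = np.linalg.inv(S0)
    sigma2 = sigma2_init
    for _ in range(n_iter):
        # E-step: Gaussian posterior of theta given the current sigma2.
        V = np.linalg.inv(Phi.T @ Phi / sigma2 + S0_inv)
        m = V @ (Phi.T @ y / sigma2 + S0_inv @ mu0)
        # M-step: expected squared residual under that posterior.
        resid = y - Phi @ m
        sigma2_new = (resid @ resid + np.trace(Phi @ V @ Phi.T)) / n_samples
        converged = abs(sigma2_new - sigma2) < tol * sigma2
        sigma2 = sigma2_new
        if converged:
            break
    return m, sigma2


# Hypothetical test problem: quadratic basis, noise std 0.1 (variance 0.01).
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=500)
Phi = np.column_stack([np.ones_like(u), u, u**2])  # assumed basis functions
theta_true = np.array([1.0, -0.5, 0.25])
y = Phi @ theta_true + 0.1 * rng.standard_normal(u.size)

# Weak zero-mean prior; a user with real prior knowledge would tighten S0.
theta_hat, sigma2_hat = em_map_linear(Phi, y, mu0=np.zeros(3), S0=10.0 * np.eye(3))
print("theta estimate:", theta_hat)
print("noise variance estimate:", sigma2_hat)
```

In this sketch the noise variance is recovered from the data instead of being specified in advance, which mirrors the setting of unknown noise characteristics. Extending the idea to dynamic models, correlated noise, and a selection step based on an information criterion would require considerably more machinery than shown here.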
The proposed algorithm is applied to simulated noisy data generated from a computer model of the Van de Vusse reactor. For this example, the process noise characteristics are unknown, and it was observed that the EM algorithm not only yields low estimation error but also accurately estimates the noise characteristics. The algorithm is also applied to a highly complex superheater system that is part of a power plant, for which dynamic data with unknown noise characteristics are available from industry.
References
[1] Z. T. Wilson and N. V. Sahinidis, "The ALAMO approach to machine learning," Comput. Chem. Eng., vol. 106, pp. 785–795, 2017, doi: 10.1016/j.compchemeng.2017.02.010.
[2] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Trans. Knowl. Data Eng., vol. 26, no. 1, pp. 97–107, 2014, doi: 10.1109/TKDE.2013.109.
[3] J. H. A. Guillaume et al., "Introductory overview of identifiability analysis: A guide to evaluating whether you have the right type of data for your modeling purpose," Environ. Model. Softw., vol. 119, no. April, pp. 418–432, 2019, doi: 10.1016/j.envsoft.2019.07.007.
[4] J. M. Ferguson, M. L. Taper, R. Zenil-Ferguson, M. Jasieniuk, and B. D. Maxwell, "Incorporating Parameter Estimability Into Model Selection," Front. Ecol. Evol., vol. 7, no. November, pp. 1–15, 2019, doi: 10.3389/fevo.2019.00427.
[5] T. Bankole and D. Bhattacharyya, "Exploiting connectivity structures for decomposing process plants," J. Process Control, vol. 71, pp. 116–129, 2018, doi: 10.1016/j.jprocont.2018.09.002.