2024 AIChE Annual Meeting

(674d) Hybrid Gaussian Radial Basis Neural Networks (GRAB-NN): Development, Training, and Applications to Modeling Nonlinear Dynamic Noisy Chemical Processes

Authors

Mukherjee, A. - Presenter, West Virginia University
Bhattacharyya, D., West Virginia University
Developing accurate first-principles models for complex nonlinear stochastic dynamic systems can be time-consuming and computationally expensive, and may be infeasible for certain systems due to lack of knowledge. It is also challenging to adapt first-principles models for time-varying probabilistic process systems. Data-driven or black-box models are relatively easier to develop, simulate, and adapt online. However, it can be difficult to accurately represent complex, nonlinear stochastic dynamical systems using conventional data-driven models such as artificial neural networks (ANNs), expert systems, and fuzzy logic [1]. Although multilayered feedforward ANNs act as universal approximators for many classification problems [2], existing literature has shown that Gaussian radial basis functions (RBFs) have outperformed conventional ANNs for function approximation problems in terms of robustness, fault tolerance, and computational expense for noisy time series data using a finite number of hidden units [3]. However, the performance of RBF networks becomes highly sensitive to the number of parameters estimated during optimal model synthesis as well as to the placement of the RBF centers [4]. If the number of RBF centers is large, the model carries a large number of parameters (weights and biases), which can lead to overfitting during training and, eventually, to inferior predictive capability, especially in the presence of uncertainties in the training data. This work proposes novel architectures and training algorithms for single-hidden-layer hybrid Gaussian radial basis neural networks (GRAB-NN) that combine Gaussian and sigmoid hidden nodes for modeling nonlinear transient chemical process systems.

The development of standalone optimal RBF networks mainly involves two steps: selection of the coordinates of the RBF centers and their widths, followed by estimation of the optimal weights in the output layer. The predictive performance of RBF networks critically relies on the number of RBF centers/widths and their respective coordinates [5]. While typical algorithms for locating the RBF centers include fuzzy clustering, particle swarm optimization, and k-means or k-nearest neighbor based clustering [6], such approaches may suffer from the curse of dimensionality when modeling higher-order systems with a larger input space and can be highly sensitive to noisy data during optimal model synthesis and parameter estimation. This work develops a recursive training algorithm in which the number and coordinates of RBF centers/widths are selected by minimizing the corrected Akaike Information Criterion (AICc) [7,8]. Efficient sequential MINLP (mixed-integer nonlinear programming) algorithms, such as bidirectional branch and bound, have been developed to determine the optimal number of RBF centers. Two types of distributions are considered for the coordinates of the RBF centers, namely deterministic (i.e., assigning constant values to the centers) and stochastic (i.e., sampling the centers from Gaussian distributions having the same means and variances as the model inputs). The selection of optimal RBF centers/widths is followed by estimation of the optimal weights by orthogonal least squares (OLS) as well as by mini-batch stochastic gradient descent (SGD) with the Adam optimizer, thus providing flexibility in trading off prediction accuracy against computational expense while modeling highly nonlinear noisy dynamic data.
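
The two-step idea described above (place the centers, then solve for the output weights, and pick the number of centers by AICc) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it scans a small list of candidate center counts exhaustively instead of using MINLP branch and bound, and it uses plain linear least squares instead of OLS or Adam. The function names, the single shared width, the least-squares form of AICc, and the parameter-counting convention are all assumptions made for illustration.

```python
import numpy as np

def gaussian_design(X, centers, width):
    """Hidden-layer outputs of Gaussian RBF nodes plus a bias column."""
    # Squared Euclidean distance from every sample to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([Phi, np.ones((X.shape[0], 1))])

def aicc(y, y_hat, k):
    """Corrected AIC for a least-squares fit with k parameters (assumed form)."""
    n = y.shape[0]
    if n - k - 1 <= 0:
        return np.inf  # correction term undefined: too many parameters
    sse = float(np.sum((y - y_hat) ** 2))
    return n * np.log(sse / n) + 2 * k + 2.0 * k * (k + 1) / (n - k - 1)

def fit_rbf(X, y, n_centers, rng):
    """Stochastic center placement followed by a linear solve for the weights."""
    # Sample centers from a Gaussian with the same mean and spread as the inputs.
    centers = rng.normal(X.mean(axis=0), X.std(axis=0),
                         size=(n_centers, X.shape[1]))
    width = float(X.std(axis=0).mean())  # one shared width (assumption)
    Phi = gaussian_design(X, centers, width)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, width, w

def select_n_centers(X, y, candidates, seed=0):
    """Return (score, m, centers, width, w) for the candidate with lowest AICc."""
    rng = np.random.default_rng(seed)
    best = None
    for m in candidates:
        centers, width, w = fit_rbf(X, y, m, rng)
        y_hat = gaussian_design(X, centers, width) @ w
        # Assumed count: center coordinates + shared width + output weights and bias.
        k = m * X.shape[1] + 1 + (m + 1)
        score = aicc(y, y_hat, k)
        if best is None or score < best[0]:
            best = (score, m, centers, width, w)
    return best
```

Calling `select_n_centers(X, y, range(1, 9))` on an input array `X` of shape `(n_samples, n_inputs)` and a target vector `y` returns the lowest-AICc model among the candidates. Because the penalty term grows with the parameter count, adding centers is only accepted when it reduces the residual enough to pay for the extra parameters.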

The proposed structures of standalone RBF networks have been coupled with conventional feedforward ANNs, resulting in the hybrid GRAB-NN architecture, where the single hidden layer combines two kinds of activation functions: the Gaussian function (in what are referred to as the RBF nodes) and the standard logistic sigmoid function (in what are referred to as the ANN nodes). It has been observed that the hybrid GRAB-NN models perform significantly better than the standalone RBF and ANN models for the same or a similar number of model parameters. Furthermore, algorithmic capabilities have been developed to accommodate linear constraints during estimation of the optimal weights for the data-driven models through the constrained least squares (CLS) algorithm. For instance, the mass conservation constraints for a chemical process system can be expressed as elemental atom balance equations in terms of the input and output boundary conditions and imposed as linear equality constraints during model training, thus leading to mass-constrained optimal GRAB-NN models. It is worth mentioning that simultaneous estimation of all model parameters (including the optimal coordinates of centers/widths as additional parameters) for such hybrid GRAB-NN models in a monolithic approach can incur excessive computational expense. Therefore, this work also develops novel sequential decomposition-based training algorithms that exploit the model structure to estimate the optimal parameters of each sublayer (i.e., the model parameters associated with the RBF nodes, the ANN nodes, and the weights in the output layer) separately and independently, while solving an outer-layer optimization to ensure convergence of the overall hybrid network.
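
As a rough illustration of the hybrid hidden layer and of equality-constrained estimation of the output weights, the Python sketch below builds a hidden layer from Gaussian and logistic sigmoid nodes and then solves a generic equality-constrained least-squares problem through its KKT system. It is a sketch under stated assumptions rather than the authors' CLS implementation: the function names are hypothetical, and the constraint `A w = b` acts directly on the output weights purely for demonstration, whereas the mass-balance constraints described above are stated in terms of the process inputs and outputs.

```python
import numpy as np

def hybrid_hidden(X, centers, width, W_sig, b_sig):
    """Single hidden layer with Gaussian (RBF) and logistic sigmoid (ANN) nodes."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    rbf = np.exp(-d2 / (2.0 * width ** 2))             # RBF nodes
    sig = 1.0 / (1.0 + np.exp(-(X @ W_sig + b_sig)))   # ANN nodes
    # Stack both node types and append a bias column for the output layer.
    return np.hstack([rbf, sig, np.ones((X.shape[0], 1))])

def constrained_lstsq(H, y, A, b):
    """Minimize ||H w - y||^2 subject to A w = b by solving the KKT system."""
    n_w, n_c = H.shape[1], A.shape[0]
    # Stationarity (2 H^T H w + A^T lam = 2 H^T y) stacked with feasibility (A w = b).
    K = np.block([[2.0 * H.T @ H, A.T],
                  [A, np.zeros((n_c, n_c))]])
    rhs = np.concatenate([2.0 * H.T @ y, b])
    # lstsq rather than solve, so a rank-deficient H^T H does not raise an error.
    sol = np.linalg.lstsq(K, rhs, rcond=None)[0]
    return sol[:n_w]  # discard the Lagrange multipliers
```

With `H = hybrid_hidden(X, centers, width, W_sig, b_sig)`, the call `constrained_lstsq(H, y, A, b)` returns output weights that satisfy the linear constraint while fitting the data as closely as the constraint allows. The hidden-layer parameters are held fixed in this step, which loosely mirrors the idea of estimating each sublayer separately.
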
The proposed network structures and training algorithms, with and without constraints, have been applied to modeling different nonlinear dynamic chemical processes, including the widely used Van de Vusse reactor and an industrial steam superheater system at a partner power plant, using actual measurements from the plant historian. The proposed structures and algorithms exhibit considerably superior results compared to the state of the art when evaluated on large-scale, complex, nonlinear, dynamic, noisy process systems.

References

  1. Venkatasubramanian, V. The promise of artificial intelligence in chemical engineering: Is it here, finally? AIChE J. 65, 466–478 (2019).
  2. Boniecki, P., Zaborowicz, M. & Sujak, A. Comparison of MLP and RBF neural models on the example graphical classification. in Thirteenth International Conference on Digital Image Processing (ICDIP 2021) (eds. Jiang, X. & Fujita, H.) 40 (SPIE, 2021). doi:10.1117/12.2600796.
  3. Motahari-Nezhad, M. & Jafari, S. M. Comparison of MLP and RBF neural networks for bearing remaining useful life prediction based on acoustic emission. Proc. Inst. Mech. Eng. Part J J. Eng. Tribol. 237, 129–148 (2023).
  4. Panchapakesan, C., Palaniswami, M., Ralph, D. & Manzie, C. Effects of moving the centers in an RBF network. IEEE Trans. Neural Networks 13, 1299–1307 (2002).
  5. Du, D., Li, K. & Fei, M. A fast multi-output RBF neural network construction method. Neurocomputing 73, 2196–2202 (2010).
  6. Montazer, G. A., Giveki, D., Karami, M. & Rastegar, H. Radial basis function neural networks: A review. Comput. Rev. J 1, 52–74 (2018).
  7. Ferguson, J. M., Taper, M. L., Zenil-Ferguson, R., Jasieniuk, M. & Maxwell, B. D. Incorporating Parameter Estimability Into Model Selection. Front. Ecol. Evol. 7, 1–15 (2019).
  8. Mukherjee, A. & Bhattacharyya, D. Hybrid Series/Parallel All-Nonlinear Dynamic-Static Neural Networks: Development, Training, and Application to Chemical Processes. Ind. Eng. Chem. Res. 62, 3221–3237 (2023).