2025 AIChE Annual Meeting

(469g) Staying Alive: Maintaining Neural Network Models in the Presence of Systemic Drift

Authors

Tyler Soderstrom, ExxonMobil
Brian A. Korgel, The University of Texas at Austin
Michael Baldea, The University of Texas at Austin
Deep learning is an increasingly widespread method for mapping inputs to outputs in systems where using first-principles models is impossible or impractical due to computational constraints [1], [2], [3]. Currently, deep learning models are typically developed in an offline train-then-deploy paradigm using historical data. However, this paradigm does not reflect realistic applications, where the underlying data-generating physical systems tend to change or evolve over time. As these systems evolve, the discrepancy between model predictions and the true system behavior grows [4], [5]. In these cases, the neural network must be altered or amended to ensure that its predictions remain accurate, a process that we refer to as model maintenance.

Current work using neural networks for process systems engineering rarely considers model maintenance. When discrepancies between the model and the underlying process are considered, the model is usually finetuned: its weights are modified by retraining the model on recent measurements, using the current model parameters as the initial guess. Finetuning, however, is unsuitable for real-time applications such as control, as the time required for iterative training passes over the finetuning dataset would introduce unacceptable delays.
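The finetuning baseline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it warm-starts gradient descent from the deployed parameters and iterates over a window of recent measurements, using a hypothetical linear surrogate model for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def finetune(theta, X_recent, y_recent, lr=0.05, epochs=50):
    """Refit a linear surrogate y = X @ theta on recent measurements,
    using the currently deployed parameters as the initial guess."""
    theta = theta.copy()                       # warm start, not from scratch
    for _ in range(epochs):                    # repeated passes over the window
        resid = X_recent @ theta - y_recent
        theta -= lr * X_recent.T @ resid / len(y_recent)
    return theta

# Illustrative drift: the deployed parameters no longer match the system.
theta_old = np.array([1.0, -0.5])
X = rng.normal(size=(32, 2))                   # window of recent inputs
y = X @ np.array([1.2, -0.3])                  # outputs from the drifted system
theta_new = finetune(theta_old, X, y)
```

The repeated passes over the recent-data window are precisely the cost that makes this approach problematic in a real-time setting.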

We are interested in maintaining the accuracy of previously trained neural network models online, amid drift in the underlying physical system. We assume that the drift is slow and monotonic, and that the structure of the governing equations of the physical system remains unchanged (i.e., there are no discontinuities). In contrast to the default method of finetuning, we propose an online method for updating neural network (NN) models as new measurements become available: the Subset Extended Kalman Filter (SEKF).

The SEKF addresses two essential questions in updating NN model parameters: which parameters to update, and how to update them. To the first point, the gradient of the loss function (available via backpropagation) is used to select the NN parameters with the highest impact on the magnitude of the loss function as it is affected by drift. To the second, we formulate an EKF whose structure is aligned with updating only the subset of parameters selected in the previous step. Applying updates to only a subset of the neural network parameters has two benefits. First, it significantly reduces the computational cost of the update step, which grows rapidly with the number of parameters updated. Second, it lends the model a degree of stability, or permanence, by leaving the majority of the NN parameters unchanged.
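The two steps above, gradient-based subset selection followed by an EKF update restricted to that subset, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the network (a one-hidden-layer tanh model with a scalar output), the subset size `k`, and the noise covariance `R` are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 3, 8
theta = rng.normal(size=n_hid * n_in + n_hid)  # [W1 flattened, w2]

def predict(theta, x):
    """Scalar output of a tiny one-hidden-layer network, y = w2 . tanh(W1 x)."""
    W1 = theta[: n_hid * n_in].reshape(n_hid, n_in)
    w2 = theta[n_hid * n_in:]
    return w2 @ np.tanh(W1 @ x)

def grad_theta(theta, x):
    """Analytic gradient of the output w.r.t. all parameters
    (stands in for backpropagation in a full-size network)."""
    W1 = theta[: n_hid * n_in].reshape(n_hid, n_in)
    w2 = theta[n_hid * n_in:]
    h = np.tanh(W1 @ x)
    dW1 = np.outer(w2 * (1.0 - h ** 2), x).ravel()
    return np.concatenate([dW1, h])

def sekf_update(theta, P, x, y_meas, k=10, R=0.01):
    """One SEKF step: select the k parameters with the largest loss
    gradient, then apply a scalar-measurement EKF update to them only."""
    g = grad_theta(theta, x)
    err = y_meas - predict(theta, x)
    loss_grad = -2.0 * err * g                  # gradient of the squared error
    idx = np.argsort(np.abs(loss_grad))[-k:]    # most drift-affected subset
    H = g[idx][None, :]                         # 1 x k measurement Jacobian
    Ps = P[np.ix_(idx, idx)]                    # covariance block for the subset
    S = H @ Ps @ H.T + R                        # innovation covariance
    K = (Ps @ H.T) / S                          # Kalman gain, k x 1
    theta = theta.copy()
    theta[idx] += (K * err).ravel()             # update only the chosen subset
    P = P.copy()
    P[np.ix_(idx, idx)] = Ps - K @ H @ Ps       # covariance update for the subset
    return theta, P

# One measurement from a drifted system: the output has shifted by 0.5.
P0 = 0.1 * np.eye(theta.size)
x = rng.normal(size=n_in)
y_meas = predict(theta, x) + 0.5
theta_new, P1 = sekf_update(theta, P0, x, y_meas)
```

Because the EKF matrix operations act on the k-by-k covariance block rather than the full parameter covariance, each update costs far less than finetuning over a data window, which is what makes the scheme viable online.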

We compare the performance of finetuning and the SEKF in terms of error, computational expense, and manual tuning required to maintain previously trained neural network models of dynamical systems of increasing complexity, up to a plant-wide dynamic model of a fluidized bed catalytic cracker/fractionator unit. We find that, across all case studies, the SEKF requires less manual hyperparameter tuning, requires less time to perform an update, and achieves at worst similar prediction error to finetuning; in many cases, models maintained using the SEKF have significantly lower error.

References

[1] G. Zeng, Y. Chen, B. Cui, and S. Yu, “Continual learning of context-dependent processing in neural networks,” Nat Mach Intell, vol. 1, no. 8, pp. 364–372, Aug. 2019, doi: 10.1038/s42256-019-0080-x.

[2] O. Santander, V. Kuppuraj, C. A. Harrison, and M. Baldea, “Deep learning economic model predictive control for refinery operation: A fluid catalytic cracker-fractionator case study,” in 2022 26th International Conference on System Theory, Control and Computing, ICSTCC 2022 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2022, pp. 331–336. doi: 10.1109/ICSTCC55426.2022.9931761.

[3] P. Kumar, J. B. Rawlings, M. J. Wenzel, and M. J. Risbeck, “Grey-box model and neural network disturbance predictor identification for economic MPC in building energy systems,” Energy Build, vol. 286, May 2023, doi: 10.1016/j.enbuild.2023.112936.

[4] G. I. Webb, L. K. Lee, F. Petitjean, and B. Goethals, “Understanding Concept Drift,” Apr. 2017, [Online]. Available: http://arxiv.org/abs/1704.00362

[5] J. Gama, I. Zliobaite, A. Bifet, M. Pechenizkiy, and A. Bouchachia, “A survey on concept drift adaptation,” ACM Computing Surveys, Apr. 2014, doi: 10.1145/2523813.