2025 AIChE Annual Meeting

(259e) State-Space Neural Network Architecture for Nonlinear Dynamical System Modeling

Authors

Prashant Mhaskar, McMaster University
Data-driven modeling and control of dynamical systems is essential in chemical process engineering. Because many real industrial systems are both nonlinear and complex, an effective modeling strategy coupled with a suitable model-based control design is crucial. Recurrent neural networks (RNNs) are often employed for time-series modeling in these settings, yet standard RNN structures can struggle to capture nonlinear process behavior accurately without additional modifications (e.g., LSTMs or GRUs), and these enhancements often demand more data to fit the model adequately. Although RNNs bear some resemblance to state-space models, they typically exhibit two key inconsistencies: they use fixed or arbitrary initial hidden states rather than learnable representations of a system's initial condition, and their internal structure does not mirror the classical state-space formulation used to describe general dynamical systems. In addition, standard RNNs impose recurrence across multiple stacked layers, rather than confining it to the state variable alone, in order to handle nonlinearities. This practice can lead to unnecessarily large, over-parameterized models, which in turn degrades accuracy.

Here, we introduce a state-space neural network (SSNN) built on a classical state-space model structure that resolves these shortcomings. The SSNN treats the initial state of each trajectory as a trainable parameter, enabling the model to learn a more faithful representation of each sequence's true starting condition and thereby avoiding the issues associated with arbitrary initialization. Further, instead of layering multiple recurrent units, the SSNN adopts a classical state-space perspective, relying on two feedforward neural networks to describe the state-transition and output equations, respectively. This ensures that recurrence exists solely through the evolution of a finite-dimensional latent state, eliminating the need to stack recurrent layers to handle nonlinearity. Earlier SSNN formulations exist, but they neither address state initialization nor eliminate the redundancy of stacked recurrent layers.
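The state-space structure described above can be sketched in a few lines. The sketch below is illustrative only: the network names (f_theta, g_phi), layer sizes, tanh activations, and dimensions are assumptions for exposition, not the authors' exact architecture, and training (where the per-trajectory initial state x0 would be optimized jointly with the network weights) is omitted.

```python
# Minimal forward-pass sketch of a state-space neural network (SSNN):
#   x_{k+1} = f_theta(x_k, u_k)   (state-transition network)
#   y_k     = g_phi(x_k)          (output network)
# Recurrence lives only in the finite-dimensional latent state x_k.
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    """Random weights for a small feedforward network (illustrative)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:   # tanh on hidden layers only
            x = np.tanh(x)
    return x

n_x, n_u, n_y = 4, 1, 2                    # latent state, input, output dims
f_theta = mlp_init([n_x + n_u, 16, n_x])   # state-transition network
g_phi = mlp_init([n_x, 16, n_y])           # output network

def ssnn_rollout(x0, u_seq):
    """Roll the SSNN forward from x0 under the input sequence u_seq.

    In the actual model, x0 is a trainable parameter per trajectory;
    here it is just an array standing in for that learned value.
    """
    x, ys = x0, []
    for u in u_seq:
        ys.append(mlp_forward(g_phi, x))
        x = mlp_forward(f_theta, np.concatenate([x, u]))
    return np.array(ys)

x0 = np.zeros(n_x)                  # would be learned jointly with theta, phi
u_seq = rng.standard_normal((20, n_u))
y_pred = ssnn_rollout(x0, u_seq)    # shape (20, n_y)
```

Because the two feedforward networks are the only learned components and recurrence is confined to x_k, the parameter count stays small compared with stacked recurrent layers of comparable expressiveness.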

To demonstrate its effectiveness, we evaluate the SSNN on a nonlinear batch stirred tank reactor simulation. The data include multiple batch runs with different initial reactant concentrations and operating strategies, testing the model's ability to generalize across diverse conditions. First, we measure the predictive performance of the SSNN and compare it to standard RNN baselines by examining mean squared error (MSE) on concentration and temperature profiles. We then integrate each model into a model predictive controller (MPC) and compare their closed-loop performance through the resulting optimal input trajectories and objective function values. Across these tests, the SSNN achieves lower MSE than standard RNNs while using a substantially smaller parameter count. These findings underscore the promise of the SSNN for accurate, parsimonious, and interpretable data-driven modeling and control of nonlinear chemical processes.
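To make the MPC integration concrete, the sketch below shows one shooting-style MPC step around a learned model. Everything in it is an assumption for illustration: the stand-in linear dynamics (the paper's SSNN would supply the rollout instead), the random-shooting optimizer (a gradient-based solver would typically be used in practice), and the horizon and cost weights.

```python
# Hedged sketch: one step of sampling-based ("random shooting") MPC
# around a learned dynamic model.  The linear map below is a stand-in
# for the trained SSNN; cost weights and horizon are illustrative.
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # stand-in state dynamics
B = np.array([[0.0], [0.5]])             # stand-in input map

def rollout(x0, u_seq):
    """Simulate the (stand-in) model forward under an input sequence."""
    x, xs = x0, []
    for u in u_seq:
        x = A @ x + B @ u
        xs.append(x)
    return np.array(xs)

def mpc_step(x0, y_ref, horizon=10, n_samples=500):
    """Sample candidate input sequences, keep the cheapest one."""
    best_cost, best_u = np.inf, None
    for _ in range(n_samples):
        u_seq = rng.uniform(-1.0, 1.0, size=(horizon, 1))
        xs = rollout(x0, u_seq)
        # Tracking error on the first state plus a small input penalty.
        cost = np.sum((xs[:, 0] - y_ref) ** 2) + 1e-2 * np.sum(u_seq ** 2)
        if cost < best_cost:
            best_cost, best_u = cost, u_seq
    return best_u[0], best_cost   # receding horizon: apply only the first input

u0, cost = mpc_step(np.zeros(2), y_ref=1.0)
```

In a receding-horizon loop, only the first input of the optimized sequence is applied before re-solving at the next sample, which is how the closed-loop input trajectories and objective values compared above would be generated.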

[1] Lanzetti, N.; Lian, Y. Z.; Cortinovis, A.; Dominguez, L.; Mercangöz, M.; Jones, C. Recurrent Neural Network based MPC for Process Industries. 2019 18th European Control Conference (ECC); pp 1005–1010.

[2] Gu, A.; Goel, K.; Ré, C. Efficiently Modeling Long Sequences with Structured State Spaces. CoRR 2021, abs/2111.00396.

[3] Alhajeri, M. S.; Luo, J.; Wu, Z.; Albalawi, F.; Christofides, P. D. Process structure based recurrent neural network modeling for predictive control: A comparative study. Chemical Engineering Research and Design 2022, 179, 77–89.