2025 AIChE Annual Meeting

(259f) Physics-Informed Model-Based Policy Optimization of Koopman Economic NMPC Policies

Authors

Alexander Mitsos - Presenter, RWTH Aachen University
Manuel Dahmen, FZ Jülich
Data-driven dynamic models present a promising avenue for rendering (economic) nonlinear model predictive control ((e)NMPC) tractable [1] for complex processes where (i) no mechanistic model is available, or (ii) a mechanistic model is available but cannot be embedded in a real-time capable (e)NMPC policy [2]. While system identification (SI) is the most common approach to training data-driven dynamic models, it focuses narrowly on maximizing average prediction accuracy. Reinforcement learning (RL) offers an alternative or complementary route to SI: it can tune (e)NMPC policies for optimal performance in a specific control task by adjusting the dynamic model [3,4,5] or parameters in the policy's objective function or constraints [5,6], e.g., state bounds. However, standard RL algorithms are notoriously sample-inefficient, which hinders their use when the number of interactions with the control system available for learning is limited [7].
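For concreteness, the kind of parameterized (e)NMPC policy that such RL tuning targets can be sketched as follows (the notation is ours, chosen for illustration, and not taken from the cited works): at each sampling instant, given the current state estimate $\hat{x}$, the policy solves

\begin{align*}
\min_{u_0,\dots,u_{N-1}} \; & \sum_{k=0}^{N-1} \ell(x_k, u_k) \\
\text{s.t.} \; & x_{k+1} = f_\theta(x_k, u_k), \quad k = 0,\dots,N-1, \\
& g_\phi(x_k, u_k) \le 0, \quad k = 0,\dots,N-1, \\
& x_0 = \hat{x},
\end{align*}

and applies the first optimal input $u_0^\star$ to the process. Here, $f_\theta$ is the data-driven dynamic model, $g_\phi$ collects constraints with tunable parameters $\phi$ (e.g., state bounds), and $\ell$ is an economic stage cost. SI fits $\theta$ by minimizing prediction error on recorded data, whereas RL adjusts $\theta$ and/or $\phi$ to maximize closed-loop performance in the control task.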

We present a novel approach [8] for sample-efficient RL-based (e)NMPC policy learning in process control that combines a model-based RL algorithm [9] with our previously published method [4] for turning Koopman (e)NMPC policies into automatically differentiable policies. Applied to an eNMPC case study based on a literature model of a continuous stirred-tank reactor (CSTR) [10], the approach achieves superior control performance and higher sample efficiency than two benchmarks: data-driven eNMPC policies whose models are obtained by system identification without subsequent RL tuning, and neural network controllers trained with model-based RL [8]. Moreover, incorporating partial prior knowledge of the system dynamics via physics-informed learning [11,12] increases sample efficiency further [8].
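To make the building blocks of this combination more tangible, the following is a minimal Python/PyTorch sketch; it is our illustration of a Koopman surrogate with linear latent dynamics and of a physics-informed training loss that penalizes disagreement with a (partially) known ODE right-hand side, not the implementation from [4] or [8]. All names, layer sizes, and the ode_rhs argument are placeholder assumptions.

```python
import torch
import torch.nn as nn

class KoopmanSurrogate(nn.Module):
    """Lifts the state into a latent space with linear dynamics,
    z_{k+1} = A z_k + B u_k, keeping the embedded optimal control problem tractable."""
    def __init__(self, n_x, n_u, n_z):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_x, 64), nn.Tanh(), nn.Linear(64, n_z))
        self.A = nn.Linear(n_z, n_z, bias=False)  # latent state-transition matrix
        self.B = nn.Linear(n_u, n_z, bias=False)  # latent input matrix
        self.decoder = nn.Linear(n_z, n_x)        # map latent predictions back to states

    def forward(self, x, u):
        z_next = self.A(self.encoder(x)) + self.B(u)
        return self.decoder(z_next)

def physics_informed_loss(model, x, u, x_next, ode_rhs, dt, weight=1.0):
    """One-step prediction loss plus a residual term that exploits partial prior
    knowledge of the dynamics (ode_rhs), in the spirit of [11,12]."""
    x_pred = model(x, u)
    data_loss = ((x_pred - x_next) ** 2).mean()
    # explicit-Euler residual against the (partially) known right-hand side
    physics_residual = x_pred - (x + dt * ode_rhs(x, u))
    return data_loss + weight * (physics_residual ** 2).mean()

# Shape-only usage example on synthetic tensors (no real CSTR data):
model = KoopmanSurrogate(n_x=2, n_u=1, n_z=8)
x, u, x_next = torch.randn(32, 2), torch.randn(32, 1), torch.randn(32, 2)
loss = physics_informed_loss(model, x, u, x_next,
                             ode_rhs=lambda x, u: torch.zeros_like(x), dt=0.1)
loss.backward()
```

Embedding such a surrogate in an automatically differentiable (e)NMPC layer, as in [4], is what allows a model-based RL algorithm in the style of [9] to propagate closed-loop performance signals back into the surrogate parameters.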

Our work integrates modern methods that increase the sample efficiency of dynamic model learning [11,12] and of RL [9] into the learning of task-optimal data-driven (e)NMPC policies. It is thus a step toward making RL-based learning of predictive controllers feasible for complex real-world process control problems where no simulator of the environment is available a priori and learning by interacting with the process is expensive.

[1] Tang, W., & Daoutidis, P. (2022). Data-driven control: Overview and perspectives. In 2022 American Control Conference (ACC), 1048-1064.

[2] McBride, K., & Sundmacher, K. (2019). Overview of surrogate modeling in chemical process engineering. Chemie Ingenieur Technik, 91(3), 228-239.

[3] Chen, B., Cai, Z., & Bergés, M. (2019). GNU-RL: A precocial reinforcement learning solution for building HVAC control using a differentiable MPC policy. In Proceedings of the 6th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, 316-325.

[4] Mayfrank, D., Mitsos, A., & Dahmen, M. (2024). End-to-end reinforcement learning of Koopman models for economic nonlinear model predictive control. Computers & Chemical Engineering, 190, 108824.

[5] Gros, S., & Zanon, M. (2019). Data-driven economic NMPC using reinforcement learning. IEEE Transactions on Automatic Control, 65(2), 636-648.

[6] Brandner, D., Talis, T., Esche, E., Repke, J. U., & Lucia, S. (2023). Reinforcement learning combined with model predictive control to optimally operate a flash separation unit. Computer Aided Chemical Engineering, 52, 595-600.

[7] Gopaluni, R. B., Tulsyan, A., Chachuat, B., Huang, B., Lee, J. M., Amjad, F., Damarla, S. K., Kim, J. W., & Lawrence, N. P. (2020). Modern machine learning tools for monitoring and control of industrial processes: A survey. IFAC-PapersOnLine, 53(2), 218-229.

[8] Mayfrank, D., Velioglu, M., Mitsos, A., & Dahmen, M. (2025). Sample-Efficient Reinforcement Learning of Koopman eNMPC. arXiv preprint arXiv:2503.18787.

[9] Janner, M., Fu, J., Zhang, M., & Levine, S. (2019). When to trust your model: Model-based policy optimization. Advances in Neural Information Processing Systems, 32, 12498-12509.

[10] Flores-Tlacuahuac, A., & Grossmann, I. E. (2006). Simultaneous cyclic scheduling and control of a multiproduct CSTR. Industrial & Engineering Chemistry Research, 45(20), 6698-6712.

[11] Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686-707.

[12] Antonelo, E. A., Camponogara, E., Seman, L. O., Jordanou, J. P., de Souza, E. R., & Hübner, J. F. (2024). Physics-informed neural nets for control of dynamical systems. Neurocomputing, 579, 127419.