Model predictive control (MPC) is widely used in industry due to its optimization-based approach, prediction horizon, and ability to incorporate output and safety constraints for practical control of multivariable systems. For MPC to select an optimal control action, it requires an accurate system model. Such a model can be developed in two ways: (1) first-principles modeling and (2) data-driven modeling. First-principles models are difficult to develop and maintain, while data-driven techniques require rich and informative data. Moreover, nonlinear MPC (NMPC) involves solving a nonconvex and challenging optimization problem, making it difficult to provide stable, feasible, and real-time control actions in a process plant, which limits its practical application. Although linear MPC is more common in practical setups, it may not perform optimally in the presence of faults, disturbances, and complex, nonlinear dynamics. Adaptive MPC can address these challenges by continuously recalibrating the linear state-space model during operation [1]; however, this constant recalibration is both challenging and costly in real-world applications.
Reinforcement learning (RL) has emerged as an alternative approach for online optimal control in many applications, especially for adaptively tuning and adjusting model parameters within MPC [2]. The parameters in the MPC optimization formulation, such as the prediction horizon, control horizon, and system matrices of the linear state-space model, are crucial for achieving an offset-free, fast-responding controller, particularly in the presence of faults, disturbances, and setpoint changes. In [3], an RL-based strategy is proposed to adjust the prediction and control horizons in response to changes in operating conditions. Building upon this, an RL agent that can efficiently adjust the system matrices and tune the model parameters of a linear industrial MPC is critical to ensure reliable closed-loop performance in real-time applications.
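For reference, these quantities enter the standard linear MPC problem, sketched here in generic textbook notation (not necessarily the exact formulation used later in this work):

\[
\min_{\Delta u_k,\ldots,\Delta u_{k+N_c-1}} \; \sum_{i=1}^{N_p} \left\| y_{k+i} - r_{k+i} \right\|_Q^2 \; + \; \sum_{i=0}^{N_c-1} \left\| \Delta u_{k+i} \right\|_R^2
\quad \text{s.t.} \quad x_{k+i+1} = A x_{k+i} + B u_{k+i}, \quad y_{k+i} = C x_{k+i},
\]

together with input and output constraints, where \(N_p\) is the prediction horizon, \(N_c\) the control horizon, and \((A, B, C)\) the identified state-space matrices; these are the quantities an RL agent can adjust online.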
Motivated by these challenges, an RL-based algorithm is developed for the online tuning of a linear MPC built from a model identified via step tests (representative of industrial MPC practice). The RL agent adjusts the model parameters as the plant operating conditions change. A twin-delayed deep deterministic policy gradient (TD3) algorithm is used to search targeted regions of the parameter space for fine-tuning. Two distinct case studies are presented: an ethylene splitter (C2 splitter) modeled in ASPEN Dynamics and a polymerization reactor [4]. Compared to previous works, our approach is applied to large-scale models and leverages RL not only to adapt model parameters but also to tune MPC parameters, providing a comprehensive solution. The performance of the RL fine-tuned MPC under disturbances and changes in operating conditions is compared with that of a nominal MPC and a nominal MPC with re-identified model parameters (adaptive MPC), demonstrating the capability of the proposed approach.
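To make the mechanism concrete, the following is a minimal, self-contained sketch of the idea rather than the implementation used in this work: a hypothetical two-state plant, a mismatched nominal model obtained offline, a simplified unconstrained linear MPC, and a mapping from a TD3-style action vector to model corrections and a prediction horizon, with closed-loop tracking cost serving as the (negative) reward. All plant matrices, correction ranges, and function names are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation): how an RL action could
# re-parameterize the model inside a linear MPC, and how closed-loop tracking
# error could serve as the reward a TD3 agent maximizes.
import numpy as np

# "True" plant (unknown to the controller) and a mismatched nominal model,
# e.g. identified from step tests under different operating conditions.
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [0.5]])
A_nom = np.array([[0.85, 0.05], [0.0, 0.75]])
B_nom = np.array([[0.0], [0.4]])
C = np.array([[1.0, 0.0]])

def apply_rl_action(action):
    """Map a TD3 action in [-1, 1]^3 to model corrections and a horizon.

    action[0], action[1]: multiplicative corrections to A and B (+/- 20 %)
    action[2]: prediction horizon between 5 and 25 steps
    """
    A = A_nom * (1.0 + 0.2 * action[0])
    B = B_nom * (1.0 + 0.2 * action[1])
    Np = int(round(15 + 10 * action[2]))
    return A, B, Np

def mpc_input(A, B, Np, x, r, R=0.1):
    """Simplified unconstrained linear MPC: stack predictions y = F x + G u
    and solve a regularized least-squares problem for the input sequence."""
    F = np.vstack([C @ np.linalg.matrix_power(A, i + 1) for i in range(Np)])
    G = np.zeros((Np, Np))
    for i in range(Np):
        for j in range(i + 1):
            G[i, j] = (C @ np.linalg.matrix_power(A, i - j) @ B).item()
    err = r * np.ones((Np, 1)) - F @ x
    u_seq = np.linalg.solve(G.T @ G + R * np.eye(Np), G.T @ err)
    return u_seq[0, 0]  # receding horizon: apply only the first move

def closed_loop_reward(action, r=1.0, T=50):
    """Negative tracking cost over one episode; this is what TD3 would see."""
    A, B, Np = apply_rl_action(action)
    x = np.zeros((2, 1))
    cost = 0.0
    for _ in range(T):
        u = mpc_input(A, B, Np, x, r)
        x = A_true @ x + B_true * u  # plant evolves with the true dynamics
        cost += ((r - C @ x) ** 2).item()
    return -cost

# Nominal model (zero action) vs. a hand-picked corrective action:
print(closed_loop_reward(np.zeros(3)))
print(closed_loop_reward(np.array([0.3, 0.25, 0.0])))
```

In the actual workflow, the TD3 agent would learn the mapping from observed plant behavior to such corrective actions, rather than relying on a hand-picked correction as in the last line above.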
References
[1] Fukushima, H., Kim, T.H. and Sugie, T., 2007. Adaptive model predictive control for a class of constrained linear systems based on the comparison model. Automatica, 43(2), pp.301-308.
[2] Bøhn, E., Gros, S., Moe, S. and Johansen, T.A., 2023. Optimization of the model predictive control meta-parameters through reinforcement learning. Engineering Applications of Artificial Intelligence, 123, p.106211.
[3] Hedrick, E., Hedrick, K., Bhattacharyya, D., Zitney, S.E. and Omell, B., 2022. Reinforcement learning for online adaptation of model predictive controllers: Application to a selective catalytic reduction unit. Computers & Chemical Engineering, 160, p.107727.
[4] Bustos, G.A., Ferramosca, A., Godoy, J.L. and González, A.H., 2016. Application of model predictive control suitable for closed-loop re-identification to a polymerization reactor. Journal of Process Control, 44, pp.1-13.