To meet market demands, chemical engineering processes must consistently deliver high-quality products, and their performance must be optimized for both quality and economic objectives. These processes are also prone to faults that can compromise safety and degrade product quality, making robust control essential. A common approach to control design is the Proportional-Integral-Derivative (PID) controller. While PID controllers are well suited to single-input single-output (SISO) systems, they perform poorly on complex, nonlinear, multivariable systems. To address these challenges, model predictive control (MPC) has been proposed. MPC handles multi-input, multi-output (MIMO) systems and, unlike classical PID controllers, can incorporate output and safety constraints directly into the control formulation [1].
The MPC algorithm requires an accurate process model to select the appropriate control action. There are two approaches to building such a model: (1) first-principles modeling and (2) data-driven modeling. Developing and maintaining first-principles models is resource-intensive, while building a nonlinear data-driven model requires informative data and continual re-tuning, which is both challenging and costly. Moreover, nonlinear MPC (NMPC) carries a significant computational burden, because it entails solving a nonconvex optimization problem that is difficult to solve to optimality. A practical alternative is linear MPC, which is more common in industry, employs a linear state-space model, and achieves acceptable closed-loop performance. However, a linear model may fail to control the system effectively in the presence of disturbances or strong nonlinearities [2].
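For concreteness, the receding-horizon problem a linear MPC solves at each sampling instant can be sketched as a small quadratic program. The sketch below, a minimal illustration using the cvxpy package, assumes an arbitrary placeholder model (the matrices A, B, C, the horizon N, the bounds, and the setpoint y_sp are all illustrative values, not taken from [1] or [2]):

```python
import cvxpy as cp
import numpy as np

# Illustrative linear state-space model (placeholder values):
#   x_{k+1} = A x_k + B u_k,   y_k = C x_k
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.5]])
C = np.array([[1.0, 0.0]])

N = 10                        # prediction horizon (assumed)
x0 = np.array([1.0, 0.0])     # current state estimate (assumed)
y_sp = 0.5                    # output setpoint (assumed)

x = cp.Variable((2, N + 1))   # predicted states over the horizon
u = cp.Variable((1, N))       # planned input moves

cost = 0
constr = [x[:, 0] == x0]
for k in range(N):
    # quadratic output-tracking cost plus an input penalty
    cost += cp.sum_squares(C @ x[:, k + 1] - y_sp) + 0.1 * cp.sum_squares(u[:, k])
    constr += [
        x[:, k + 1] == A @ x[:, k] + B @ u[:, k],  # linear model prediction
        cp.abs(u[:, k]) <= 1.0,                    # input constraint
        C @ x[:, k + 1] <= 2.0,                    # output/safety constraint
    ]

prob = cp.Problem(cp.Minimize(cost), constr)
prob.solve()                  # a convex QP, unlike NMPC's nonconvex problem
u_now = u.value[:, 0]         # apply the first move, then re-solve next step
```

Because the model is linear and the cost quadratic, this problem is convex and can be solved reliably online, which is precisely the tractability advantage linear MPC holds over NMPC.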
Model-free reinforcement learning (RL) has emerged as a promising alternative to NMPC. To learn an optimal policy, an RL agent typically requires random exploration. Such exploration is infeasible in a chemical plant, where product quality and safety are the top priorities. An alternative is to pre-train the agent offline so that only restricted exploration, which does not compromise plant objectives, is needed online. Pre-training, however, inherits the central challenge of MPC: it still demands an accurate system model. Since a linear MPC is often already available, a natural question arises: can this limited information be used to pre-train an RL agent before deployment on the real plant, thereby avoiding random exploration?
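To illustrate what restricted exploration can look like in practice, one simple scheme is to perturb the pre-trained policy's action with small noise and clip the result to known-safe input bounds. The sketch below is only an illustration of this idea; the function name, noise model, and parameters are hypothetical and not taken from the cited work:

```python
import numpy as np

def restricted_explore(policy_action, u_min, u_max, noise_scale=0.05):
    """Perturb the pre-trained policy's action with small Gaussian noise,
    then clip to known-safe input bounds so exploration never leaves them."""
    noisy = policy_action + np.random.normal(0.0, noise_scale, size=policy_action.shape)
    return np.clip(noisy, u_min, u_max)

# e.g., u = restricted_explore(np.array([0.3]), u_min=-1.0, u_max=1.0)
```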
Our group previously leveraged an existing linear MPC to design an RL-based controller [3]. In that method, the linear MPC optimization formulation is used to pre-train the agent. The RL agent, having imitated the behavior of the linear MPC, is then deployed on the real plant, where it safely learns the system's nonlinearity and dynamics through restricted exploration that avoids safety hazards. Building on this, the present study demonstrates the applicability of the RL approach to large-scale industrial chemical processes. To broaden the algorithm's applicability, we also incorporate output and safety constraints. The efficacy of the safe RL method is evaluated through two case studies: (1) control of a polymerization reactor [4], and (2) control of an ethylene (C2) splitter distillation column simulated in Aspen Dynamics. An industrial step-response-based MPC is used for pre-training, and real-time simulations demonstrate the effectiveness of the RL-based control strategy.
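The pre-training in [3] is built on the linear MPC optimization formulation itself; as a simplified stand-in for that step, the sketch below shows plain behavior cloning, i.e., fitting a small policy network to logged (state, MPC action) pairs. The network architecture, data shapes, and training settings are illustrative assumptions, not the settings of [3]:

```python
import torch
import torch.nn as nn

# Behavior-cloning stand-in for MPC-based pre-training (simplified):
# fit a small policy network to (state, action) pairs produced by the
# existing linear MPC, so the agent starts from MPC-like behavior.
policy = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

states = torch.randn(1024, 2)       # placeholder for logged plant states
mpc_actions = torch.randn(1024, 1)  # placeholder for the MPC's control moves

for epoch in range(200):
    pred = policy(states)           # policy's imitation of the MPC action
    loss = loss_fn(pred, mpc_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

An agent initialized this way reproduces the MPC's behavior at deployment and only then refines itself online through the restricted exploration described above.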
References
[1] Wahid, A. and Prasetyo, A.P., 2018, March. A comparative study between MPC and PI controller to control vacuum distillation unit for producing LVGO, MVGO, and HVGO. In IOP Conference Series: Materials Science and Engineering (Vol. 334, No. 1, p. 012020). IOP Publishing.
[2] Khather, S.I., Ibrahim, M.A. and Abdullah, A.I., 2023. Review and performance analysis of nonlinear model predictive control – current prospects, challenges and future directions. Journal Européen des Systèmes Automatisés, 56(4).
[3] Hassanpour, H., Wang, X., Corbett, B. and Mhaskar, P., 2024. A practically implementable reinforcement learning-based process controller design. AIChE Journal, 70(1), p.e18245.
[4] Bustos, G.A., Ferramosca, A., Godoy, J.L. and González, A.H., 2016. Application of model predictive control suitable for closed-loop re-identification to a polymerization reactor. Journal of Process Control, 44, pp.1-13.