2024 AIChE Annual Meeting
(14g) Data-Driven Development of Advanced Controllers for Complex Reaction Systems with Minimal Prior Information
In response to this challenge, this work develops a comprehensive data-driven modelling and control framework that adapts to the variability of complex feedstocks. Online reaction monitoring with spectroscopic methods is a promising basis for such a framework: it is rapid, non-invasive, non-destructive and cost-effective, and a broad range of methods is available for spectral analysis. This study illustrates the development of an inferential model, coupled with advanced controllers, for complex reactive systems based on spectroscopic sensing with minimal prior knowledge of species and reactions.
Joint Non-negative Tensor Factorization [1] is employed to deconvolve the mixture spectra into pseudo-component concentration profiles and spectral profiles. This step provides a basis for hypothesizing reaction networks: functional groups are identified within the pseudo-components, and Bayesian networks are used to identify plausible reaction pathways, which supply directional constraints for the subsequent modelling phases.
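As a much-simplified illustration of the deconvolution idea, the sketch below factorizes a matrix of mixture spectra D (samples by spectral channels) into non-negative concentration profiles C and pseudo-component spectra S. It uses plain non-negative matrix factorization with Lee-Seung multiplicative updates, not the joint tensor factorization of [1]. The synthetic Gaussian bands, the A -> B concentration profiles, the number of pseudo-components and all function and variable names are assumptions made for illustration only.

```python
import numpy as np

def nmf(D, n_components, n_iter=1000, eps=1e-9, seed=0):
    """Plain NMF via multiplicative updates: D ~= C @ S with C, S >= 0."""
    rng = np.random.default_rng(seed)
    n_samples, n_channels = D.shape
    C = rng.random((n_samples, n_components)) + eps   # concentration profiles
    S = rng.random((n_components, n_channels)) + eps  # pseudo-component spectra
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        S *= (C.T @ D) / (C.T @ C @ S + eps)
        C *= (D @ S.T) / (C @ S @ S.T + eps)
    return C, S

# Synthetic data (hypothetical): two pseudo-components with Gaussian bands.
channels = np.linspace(0.0, 1.0, 60)
S_true = np.vstack([
    np.exp(-((channels - 0.3) / 0.05) ** 2),
    np.exp(-((channels - 0.7) / 0.08) ** 2),
])
t = np.linspace(0.0, 1.0, 25)[:, None]
C_true = np.hstack([np.exp(-3.0 * t), 1.0 - np.exp(-3.0 * t)])  # A -> B
D = C_true @ S_true  # noise-free mixture spectra

C_est, S_est = nmf(D, n_components=2)
rel_error = np.linalg.norm(D - C_est @ S_est) / np.linalg.norm(D)
print(f"relative reconstruction error: {rel_error:.3e}")
```

The recovered factors are unique only up to scaling and permutation of the pseudo-components, which is one reason additional structure (such as the joint, structure-preserving formulation cited above) is useful in practice.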
Once the mixture spectra have been deconvolved and the reaction network has been obtained, the next step is to create a kinetic model for the system. Such systems have traditionally been described with black-box models that have a large number of parameters, but these models lack interpretability. This study therefore relies on a hybrid modelling approach, Neural Ordinary Differential Equations (NODEs) [2].
NODEs are designed to follow the ODEs that govern the dynamics of the system. The NODE structure is constrained by the reaction network, the reactor dynamics (a CSTR in this case) and the pseudo-component concentration profiles derived in the previous steps. This enables the NODE to estimate the kinetics underpinning the process, with activation energies, pre-exponential factors and reaction orders appearing as weights in the network structure.
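To make the constrained structure concrete, the sketch below writes out the kind of right-hand side such a model is built around: a CSTR mass balance, dc/dt = (c_in - c)/tau + N^T r, with power-law Arrhenius rates r_j = A_j exp(-Ea_j/(R T)) prod_i c_i^(n_ji). In a NODE the pre-exponential factors, activation energies and reaction orders would be trainable weights fitted to the pseudo-component profiles; here they are fixed, hypothetical values, and the equations are simply integrated with a fixed-step Runge-Kutta scheme. The A -> B -> C network, the parameter values and all names are assumptions for illustration, not the system or implementation of [2].

```python
import numpy as np

R_GAS = 8.314  # universal gas constant, J/(mol K)

def cstr_rhs(c, T, params, c_in, tau):
    """CSTR balance with network-constrained kinetics: (c_in - c)/tau + N^T r."""
    k = params["A"] * np.exp(-params["Ea"] / (R_GAS * T))  # Arrhenius constants
    # Power-law rates r_j = k_j * prod_i c_i ** order[j, i]
    rates = k * np.prod(np.clip(c, 0.0, None) ** params["order"], axis=1)
    return (c_in - c) / tau + params["stoich"].T @ rates

def rk4_step(f, c, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(c)
    k2 = f(c + 0.5 * dt * k1)
    k3 = f(c + 0.5 * dt * k2)
    k4 = f(c + dt * k3)
    return c + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Hypothetical network A -> B -> C (rows: reactions, columns: species A, B, C).
params = {
    "A": np.array([1.0e6, 5.0e5]),      # pre-exponential factors, 1/s
    "Ea": np.array([5.0e4, 5.5e4]),     # activation energies, J/mol
    "order": np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    "stoich": np.array([[-1.0, 1.0, 0.0], [0.0, -1.0, 1.0]]),
}
c_in = np.array([1.0, 0.0, 0.0])  # feed concentrations
tau, T = 100.0, 350.0             # residence time (s), temperature (K)

c = np.zeros(3)  # reactor initially free of all species
for _ in range(2000):
    c = rk4_step(lambda x: cstr_rhs(x, T, params, c_in, tau), c, dt=1.0)
print("concentrations after 20 residence times:", c)
```

Because the stoichiometric matrix fixes which species each rate can create or consume, the learnable part of the model is confined to the rate expressions, which is what keeps the fitted parameters interpretable as kinetic quantities.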
A model using Long Short-Term Memory (LSTM) networks is presented as a benchmark for comparison. This is a black-box modelling approach, in contrast to the NODE, which is a hybrid modelling approach. In our study, the LSTM has been designed as a 5-step-ahead predictor. Note that the LSTM is also trained on the deconvolved pseudo-component spectra and their concentrations, meaning that it is a black-box approach only in terms of identifying kinetics.
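The data preparation for a multi-step-ahead sequence predictor can be sketched as follows. The function slices a concentration trajectory into input windows and the five steps that follow each window, which is the form of training pair a sequence model such as an LSTM would consume. The lookback length, the choice to predict all five future steps at once (rather than only the fifth), the placeholder trajectory and the function name are assumptions; the network itself is omitted.

```python
import numpy as np

def make_windows(series, lookback, horizon=5):
    """Slice a (time, features) array into (inputs, targets) training pairs.

    inputs[i]  = series[i : i + lookback]
    targets[i] = series[i + lookback : i + lookback + horizon]
    """
    series = np.asarray(series, dtype=float)
    n_pairs = series.shape[0] - lookback - horizon + 1
    if n_pairs <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([series[i:i + lookback] for i in range(n_pairs)])
    Y = np.stack([series[i + lookback:i + lookback + horizon]
                  for i in range(n_pairs)])
    return X, Y

# Placeholder trajectory (hypothetical): 50 time steps, 3 pseudo-components.
traj = np.arange(150, dtype=float).reshape(50, 3)
X, Y = make_windows(traj, lookback=10)
print(X.shape, Y.shape)
```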
Finally, we compare two controllers built on the NODE model: Model Predictive Control (MPC) and Reinforcement Learning (RL). The MPC maximizes the concentration of one species while minimizing that of another, with a penalty on large input deviations. We use Deep Deterministic Policy Gradients (DDPG) as the RL method. DDPG blends value-based and policy-based algorithms and uses two neural networks: the Actor proposes actions, and the Critic estimates the value (expected cumulative reward) of those actions. The Actor is updated using the gradient of the Critic's value estimate with respect to the action, so that the expected reward is maximized by the end of training. The two control strategies are compared under similar control objectives and reward functions, for setpoint tracking and regulatory control problems related to selectivity with respect to a desired product. The MPC relies on online optimization for this purpose, while the DDPG framework learns control strategies under the same objectives and formulations by autonomously discovering efficient trajectories.
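The structure of the MPC objective described above can be sketched on a toy problem. The example below applies receding-horizon control to a hypothetical isothermal CSTR with A -> B -> C, where B is the desired product, C the undesired one and the dilution rate u is the manipulated input. The stage cost rewards B, penalizes C and penalizes input moves. For simplicity the optimizer is a brute-force enumeration over a coarse input grid, standing in for the gradient-based solver a practical MPC would use; the model, rate constants, weights, horizon and names are all assumptions and do not come from the study.

```python
import itertools

K1, K2 = 0.05, 0.02        # hypothetical first-order rate constants, 1/s
C_FEED = (1.0, 0.0, 0.0)   # feed concentrations of A, B, C
DT = 5.0                   # explicit-Euler step, s

def step(c, u):
    """One Euler step of the CSTR model; u is the dilution rate (1/s)."""
    ca, cb, cc = c
    r1, r2 = K1 * ca, K2 * cb
    return (
        ca + DT * (u * (C_FEED[0] - ca) - r1),
        cb + DT * (u * (C_FEED[1] - cb) + r1 - r2),
        cc + DT * (u * (C_FEED[2] - cc) + r2),
    )

def cost(c, u_prev, u_seq, w_c=1.0, w_u=50.0):
    """Sum over the horizon of: -B + w_c * C + w_u * (input move)^2."""
    total = 0.0
    for u in u_seq:
        c = step(c, u)
        total += -c[1] + w_c * c[2] + w_u * (u - u_prev) ** 2
        u_prev = u
    return total

def mpc_action(c, u_prev, candidates, horizon=3):
    """Enumerate input sequences and return the first move of the best one."""
    best = min(itertools.product(candidates, repeat=horizon),
               key=lambda seq: cost(c, u_prev, seq))
    return best[0]

candidates = [0.01 + 0.015 * i for i in range(7)]  # allowed dilution rates
c, u = (1.0, 0.0, 0.0), 0.05
history = []
for _ in range(40):
    u = mpc_action(c, u, candidates)  # re-optimize at every sample
    c = step(c, u)                    # apply only the first move
    history.append((u, c))
print("final input and state:", history[-1])
```

Only the first move of each optimized sequence is applied before the problem is solved again from the new state, which is what distinguishes receding-horizon control from a one-off open-loop optimization.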
MPC effectively meets control objectives through precise tuning, despite occasional challenges with derivative calculations due to model stiffness. In contrast, RL can align with control goals but requires carefully designed rewards to accommodate various setpoints and balance the significance of states and inputs, leading to extended training times. Although RL offers adaptability, its generalizability for control tasks does not yet match the robustness of MPC in handling a wide range of operational scenarios.
Thus, we present an end-to-end framework for process control and monitoring purely from spectroscopic sensing data with minimal prior information about the species and reactions in the system. The methodology is agnostic to the specifics of the system being studied and only assumes that appropriate spectroscopic sensors have been chosen, i.e., that they contain some information on functional groups. In our view, this represents one of the first attempts to develop an end-to-end framework with such minimal prior information about the system.
References:
[1] Puliyanda, A., Sivaramakrishnan, K., Li, Z., De Klerk, A., & Prasad, V. (2021). Structure-Preserving Joint Non-negative Tensor Factorization to Identify Reaction Pathways Using Bayesian Networks. Journal of Chemical Information and Modeling, 61(12), 5747–5762. https://doi.org/10.1021/acs.jcim.1c00789
[2] Puliyanda, A., Srinivasan, K., Li, Z., & Prasad, V. (2023). Benchmarking chemical neural ordinary differential equations to obtain reaction network-constrained kinetic models from spectroscopic data. Engineering Applications of Artificial Intelligence, 125, 106690. https://doi.org/10.1016/j.engappai.2023.106690