2025 AIChE Annual Meeting

(10d) Deep Learning Models with Hard Linear Inequality Constraints

Authors

Hao Chen - Presenter, Purdue University
Gonzalo Constante-Flores, Purdue University
In recent years, deep learning has achieved remarkable success across a wide range of applications by uncovering patterns within data [1-5]. Despite their expressivity and efficiency, deep learning models still struggle to strictly satisfy constraints that relate their inputs and outputs. This limitation arises because these models are typically trained by solving unconstrained optimization problems. Although physics-informed neural networks (PINNs) have been studied extensively [6, 7], these methods impose only soft constraints by minimizing the prediction error and the constraint violation simultaneously. This amounts to a multi-objective optimization problem and usually compromises predictive accuracy [8]. As a result, prior knowledge about the data is not fully exploited in the learning process.

More importantly, the lack of hard constraint satisfaction prevents the use of deep learning models in certain domains, because even minor constraint violations can result in cascading errors and unexpected failures [9]. This issue is particularly problematic in high-stakes applications such as model predictive control (MPC), where the feasibility of decisions is critical to the safety of chemical plants. For example, although neural networks can in principle be trained to directly approximate the optimal control action in MPC, they are usually used as the predictive model instead [10-12]. This is because a vanilla neural network typically produces control actions that are near-optimal but infeasible in practice. Consequently, one must solve a constrained optimization problem with embedded neural networks to ensure the feasibility of the control action, which can be impractical when a near-instant response is required online. Although the online computational overhead can be addressed by multi-parametric programming, the complexity of solving multi-parametric programming problems still grows exponentially with the problem dimension [13-17].

In this work, we introduce an architecture that enforces hard linear constraints in deep learning models. The proposed architecture, which handles input-dependent constraints that are linear in the outputs, comprises two networks: (i) a feasibility network based on a decision rule that maps the input to feasible (but potentially conservative) outputs, and (ii) an optimality network, which can adopt any deep learning architecture and is tasked with minimizing the loss function while satisfying the linear equality constraints through a projection layer [18]. The parameters of the feasibility network and the projection layer are computed offline by solving two tractable optimization problems. The final output is computed as a convex combination of the outputs of the two networks, which guarantees that all constraints are satisfied during both training and inference. Unlike existing methods, our approach is non-iterative, enabling fast online inference with minimal computational overhead.
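The idea of combining a projection layer with a convex combination can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes fixed constraint data `A`, `b`, `C`, `d` for a single input, a hand-picked feasible point standing in for the feasibility network's output, and one plausible rule for the combination weight (the largest `alpha` in [0, 1] that keeps all inequalities satisfied); the abstract does not specify how the weight is chosen. The function names are hypothetical.

```python
import numpy as np

def project_onto_equalities(y, C, d):
    """Orthogonal projection of y onto {y : C y = d}; C is assumed full row rank."""
    return y - C.T @ np.linalg.solve(C @ C.T, C @ y - d)

def blend_to_feasible(y_feas, y_opt, A, b, tol=1e-12):
    """Return y = y_feas + alpha * (y_opt - y_feas) with the largest alpha in [0, 1]
    such that A y <= b. Assumes y_feas already satisfies A y_feas <= b."""
    slack = b - A @ y_feas              # nonnegative because y_feas is feasible
    step = A @ (y_opt - y_feas)         # change of each constraint row along the segment
    increasing = step > tol             # only these rows can become violated
    ratios = slack[increasing] / step[increasing]
    alpha = min(1.0, ratios.min()) if ratios.size else 1.0
    alpha = max(alpha, 0.0)
    return y_feas + alpha * (y_opt - y_feas), alpha

# Toy constraints (assumed for illustration): 0 <= y_i <= 1 and y_1 + y_2 = 1.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
C = np.array([[1.0, 1.0]])
d = np.array([1.0])

y_feas = np.array([0.5, 0.5])           # stand-in for the feasibility network output
y_raw = np.array([1.8, 0.4])            # stand-in for the raw optimality network output

y_proj = project_onto_equalities(y_raw, C, d)   # satisfies C y = d, may violate A y <= b
y, alpha = blend_to_feasible(y_feas, y_proj, A, b)
```

Because both `y_feas` and `y_proj` satisfy the equality constraints, every point on the segment between them does as well, so the blend only has to take care of the inequalities. In this toy case the weight works out to 5/7 and the blended output lands on the boundary point (1, 0). Both steps are closed-form, which is consistent with the non-iterative inference described above.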

We evaluate the architecture both as a proxy for solving optimization problems with varying left- and right-hand side parameters, and across a wide range of reinforcement learning tasks to enforce linear constraints on observation–action pairs. Numerical experiments highlight the proposed model’s performance in terms of data efficiency, prediction accuracy, and inference speed.

[1] I. Fahmi and S. Cremaschi, "Process synthesis of biodiesel production plant using artificial neural networks as the surrogate models," Computers & Chemical Engineering, vol. 46, pp. 105-123, 2012, doi: https://doi.org/10.1016/j.compchemeng.2012.06.006.

[2] S. Qin, S. Jiang, J. Li, P. Balaprakash, R. C. Van Lehn, and V. M. Zavala, "Capturing molecular interactions in graph neural networks: a case study in multi-component phase equilibrium," Digital Discovery, vol. 2, no. 1, pp. 138-151, 2023, doi: 10.1039/D2DD00045H.

[3] N. Triantafyllou et al., "Machine learning-based decomposition for complex supply chains," in Computer Aided Chemical Engineering, vol. 52, A. C. Kokossis, M. C. Georgiadis, and E. Pistikopoulos Eds.: Elsevier, 2023, pp. 1655-1660.

[4] M. Di Martino, S. Avraamidou, and E. N. Pistikopoulos, "A Neural Network Based Superstructure Optimization Approach to Reverse Osmosis Desalination Plants," Membranes, vol. 12, no. 2, doi: 10.3390/membranes12020199.

[5] T. McDonald, C. Tsay, A. M. Schweidtmann, and N. Yorke-Smith, "Mixed-integer optimisation of graph neural networks for computer-aided molecular design," Computers & Chemical Engineering, vol. 185, p. 108660, 2024, doi: https://doi.org/10.1016/j.compchemeng.2024.108660.

[6] W. Bradley et al., "Perspectives on the integration between first-principles and data-driven modeling," Computers & Chemical Engineering, vol. 166, p. 107898, 2022, doi: https://doi.org/10.1016/j.compchemeng.2022.107898.

[7] G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, and L. Yang, "Physics-informed machine learning," Nature Reviews Physics, vol. 3, pp. 422-440, 2021.

[8] A. Krishnapriyan, A. Gholami, S. Zhe, R. Kirby, and M. W. Mahoney, "Characterizing possible failure modes in physics-informed neural networks," in Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, and J. W. Vaughan, Eds., 2021, vol. 34: Curran Associates, Inc., pp. 26548-26560. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2021/file/df438e5206f31600e6ae4af72f2725f1-Paper.pdf

[9] K. Ma et al., "Data-driven strategies for optimization of integrated chemical plants," Computers & Chemical Engineering, vol. 166, p. 107961, 2022, doi: https://doi.org/10.1016/j.compchemeng.2022.107961.

[10] J. Katz, I. Pappas, S. Avraamidou, and E. N. Pistikopoulos, "Integrating Deep Learning and Explicit MPC for Advanced Process Control," in 2020 American Control Conference (ACC), 1-3 July 2020, pp. 3559-3564, doi: 10.23919/ACC45564.2020.9147582.

[11] M. S. Alhajeri, F. Abdullah, Z. Wu, and P. D. Christofides, "Physics-informed machine learning modeling for predictive control using noisy data," Chemical Engineering Research and Design, vol. 186, pp. 34-49, 2022, doi: https://doi.org/10.1016/j.cherd.2022.07.035.

[12] G. Wu, W. T. G. Yion, K. L. N. Q. Dang, and Z. Wu, "Physics-informed machine learning for MPC: Application to a batch crystallization process," Chemical Engineering Research and Design, vol. 192, pp. 556-569, 2023, doi: https://doi.org/10.1016/j.cherd.2023.02.048.

[13] A. Bemporad, M. Morari, V. Dua, and E. N. Pistikopoulos, "The explicit linear quadratic regulator for constrained systems," Automatica, vol. 38, no. 1, pp. 3-20, 2002, doi: https://doi.org/10.1016/S0005-1098(01)00174-1.

[14] E. N. Pistikopoulos, V. Dua, N. A. Bozinis, A. Bemporad, and M. Morari, "On-line optimization via off-line parametric optimization tools," Computers & Chemical Engineering, vol. 24, no. 2, pp. 183-188, 2000, doi: https://doi.org/10.1016/S0098-1354(00)00510-X.

[15] V. Sakizlis, N. M. P. Kakalis, V. Dua, J. D. Perkins, and E. N. Pistikopoulos, "Design of robust model-based controllers via parametric programming," Automatica, vol. 40, no. 2, pp. 189-201, 2004, doi: https://doi.org/10.1016/j.automatica.2003.08.011.

[16] D. Narciso, N. Faísca, K. Kouramas, and E. Pistikopoulos, "Multi-Parametric Model-Based Control: Theory and Applications, Volume 2," 2011, pp. 77-103.

[17] N. A. Diangelakis, B. Burnak, J. Katz, and E. N. Pistikopoulos, "Process design and control optimization: A simultaneous approach by multi-parametric programming," AIChE Journal, vol. 63, no. 11, pp. 4827-4846, 2017, doi: https://doi.org/10.1002/aic.15825.

[18] H. Chen, G. E. C. Flores, and C. Li, "Physics-informed neural networks with hard linear equality constraints," Computers & Chemical Engineering, vol. 189, p. 108764, 2024, doi: https://doi.org/10.1016/j.compchemeng.2024.108764.