2025 AIChE Annual Meeting

(644h) Development of a Framework for Automatic Discovery of Optimal Hybrid First Principles - Machine Learning Models

Authors

Angan Mukherjee - Presenter, West Virginia University
Nishant Vinayak Giridhar, West Virginia University
Debangsu Bhattacharyya, West Virginia University
First-principles (FP) models can provide good predictive capability even when data are unavailable or when data collection is infeasible given the current state of measurement technology. However, developing accurate FP models for complex nonlinear dynamic process systems can be computationally expensive and time- and resource-intensive, and the resulting models may scale poorly and be intractable for online adaptation. Furthermore, they require extensive specialized domain knowledge to formulate and often must be simplified through suitable assumptions to be solved reliably. In contrast, machine learning (ML) or black-box models are relatively easy to develop, simulate, and adapt online, even for complex and ill-defined processes [1,2]. A primary limitation of conventional data-driven models, however, is their lack of predictive or extrapolative capability, especially when there is an information gap in the data used for model development.

Hybrid first principles – machine learning (FPML) models exploit the strengths of both FP and ML models [3,4]. However, existing state-of-the-art approaches to hybrid modeling typically rely on domain knowledge and/or heuristics, so there is a critical need for a systematic method for synthesizing FPML models. This work develops a novel systematic framework for automatic discovery of the optimal structure of a hybrid FPML model by using disjunctive programming-based superstructure optimization [5–7].

The proposed approach introduces a generic formulation for the superstructure of hybrid models, represented by binary decision variables that encode the data flow between measured, intermediate, and predicted variables. These binary arrays serve as mapping matrices that determine which state variables are used as inputs or outputs of the ML model. The structure of the ML model, including the sizes of the input and output layers and thus the number of trainable parameters, is implicitly governed by the superstructure decision variables. The framework remains agnostic to the specific ML model used. The complexity of the hybrid model is optimized given observational data through an information-theoretic criterion [8,9] used as the objective function. The framework can accommodate FP models of various levels of rigor and accuracy, with inaccuracies stemming from aleatoric and epistemic uncertainties such as model form discrepancy, parametric uncertainty, and measurement noise. We employ a generalized disjunctive programming (GDP) formulation that embeds hierarchical superstructure decisions with logic for the optimal synthesis of FPML models. The approach has been applied to hybrid modeling of several nonlinear systems. As one example, it is applied to a dynamic chemical reactor with known reaction kinetics but uncertain model parameters (e.g., rate constants and reaction orders) and noisy/biased predictions. It is also applied to an industrial superheater system where the FP model suffers from model form uncertainty. The proposed algorithm is observed to automatically identify hybrid structures that are non-intuitive and that can be time-consuming or difficult to identify through heuristic methods.
This work also provides a systematic first step towards the automated discovery of optimal hybrid FPML models, with applications in real-time monitoring, soft sensing, and control of complex process systems where mechanistic knowledge and data science can be synergistically coupled to refine model predictions.
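
As a rough illustration of the idea of binary mapping variables scored by an information-theoretic objective, the following toy sketch may help. It is not the authors' formulation. It assumes a hypothetical FP model `fp_model` with a biased gain and a missing term, uses a linear least-squares fit as a stand-in for the ML model, takes the Akaike information criterion (AIC) as one possible criterion, and replaces the GDP solver with exhaustive enumeration over a binary mask of candidate ML inputs. All names and data are invented for illustration.

```python
import itertools
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fp_model(x):
    """Hypothetical FP model: biased gain on x[0] and a missing x[1] term."""
    return 1.5 * x[0]


def score_structure(mask, data):
    """AIC of the hybrid model whose ML inputs are selected by the binary mask."""
    n = len(data)
    # The ML part corrects the FP residual (a parallel hybrid structure, assumed here).
    residual = [y - fp_model(x) for x, y in data]
    cols = [i for i, z in enumerate(mask) if z]
    pred = [0.0] * n  # all-zero mask means the FP model is used alone
    if cols:
        feats = [[x[i] for i in cols] for x, _ in data]
        k = len(cols)
        # Normal equations for the linear stand-in "ML" model.
        ata = [[sum(f[i] * f[j] for f in feats) for j in range(k)] for i in range(k)]
        atb = [sum(f[i] * r for f, r in zip(feats, residual)) for i in range(k)]
        w = solve_linear(ata, atb)
        pred = [sum(wi * fi for wi, fi in zip(w, f)) for f in feats]
    rss = sum((r - p) ** 2 for r, p in zip(residual, pred))
    # AIC for Gaussian errors: fit term plus a penalty on trainable parameters.
    return n * math.log(rss / n) + 2 * len(cols)


# Synthetic data: the true process depends on x[0] and x[1]; x[2] is irrelevant.
rng = random.Random(0)
DATA = []
for _ in range(200):
    x = [rng.uniform(0.0, 1.0) for _ in range(3)]
    y = 2.0 * x[0] + 0.5 * x[1] + rng.gauss(0.0, 0.02)
    DATA.append((x, y))

# Enumerate every binary mask; a GDP solver would search this space instead.
scores = {mask: score_structure(mask, DATA)
          for mask in itertools.product((0, 1), repeat=3)}
best_mask = min(scores, key=scores.get)
print("selected ML inputs (mask):", best_mask)
```

In this sketch the mask plays the role of a single row of a mapping matrix, and the number of selected inputs sets the parameter count that the criterion penalizes. Enumeration is only workable for a handful of candidates, which is one reason a structured optimization formulation becomes attractive as the superstructure grows.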

References

1. Mukherjee, A. & Bhattacharyya, D. Development of Steady-State and Dynamic Mass and Energy Constrained Neural Networks for Distributed Chemical Systems Using Noisy Transient Data. Ind Eng Chem Res 63, 14211–14239 (2024).

2. Mukherjee, A. & Bhattacharyya, D. Hybrid Series/Parallel All-Nonlinear Dynamic-Static Neural Networks: Development, Training, and Application to Chemical Processes. Ind Eng Chem Res 62, 3221–3237 (2023).

3. Mukherjee, A. et al. Development of hybrid first principles – artificial intelligence models for transient modeling of power plant superheaters under load-following operation. Appl Therm Eng 262, 124795 (2025).

4. Shah, P., Pahari, S., Bhavsar, R. & Kwon, J. S.-I. Hybrid modeling of first-principles and machine learning: A step-by-step tutorial review for practical implementation. Comput Chem Eng 194, 108926 (2025).

5. Perez, H. D. & Grossmann, I. E. Extensions to generalized disjunctive programming: hierarchical structures and first-order logic. Optimization and Engineering 25, 959–998 (2024).

6. Pistikopoulos, E. N. & Tian, Y. Advanced Modeling and Optimization Strategies for Process Synthesis. Annu Rev Chem Biomol Eng 15, 81–103 (2024).

7. Liñán, D. A. & Ricardez‐Sandoval, L. A. A Benders decomposition framework for the optimization of disjunctive superstructures with ordered discrete decisions. AIChE Journal 69 (2023).

8. Mukherjee, A. & Bhattacharyya, D. On the Development of Steady-State and Dynamic Mass-Constrained Neural Networks Using Noisy Transient Data. Comput Chem Eng 187, 108722 (2024).

9. Adeyemo, S. & Bhattacharyya, D. Optimal nonlinear dynamic sparse model selection and Bayesian parameter estimation for nonlinear systems. Comput Chem Eng 180, 108502 (2024).