2025 AIChE Annual Meeting

(126d) A Novel Hybrid Modeling Framework: Integration with Transformer-Based Approaches

Authors

Parth Shah - Presenter, Texas A&M University
Joseph Kwon, Texas A&M University

The chemical and biochemical industries continue to face persistent challenges in accurately modeling complex systems under varying operating conditions. Traditional first-principles models (FPMs) offer interpretability and extrapolation capabilities but often fall short in capturing latent phenomena due to incomplete system knowledge. Conversely, data-driven models excel in prediction accuracy within narrow domains but are constrained by data availability and often lack physical consistency. Hybrid modeling has emerged as a compelling approach that combines the strengths of both methodologies [1,2]. However, conventional hybrid models, often based on shallow neural networks or manually engineered features, still struggle to generalize across process types and operating conditions.

Recent advancements in transformer-based architectures, originally developed for natural language processing, have opened new avenues for modeling structured chemical systems [2]. Transformers excel at learning context-rich representations and capturing long-range dependencies, making them ideal candidates for complex, nonlinear chemical process modeling [3,4]. In this work, we propose a novel hybrid modeling framework that integrates a transformer encoder with a physics-based model for improved prediction under diverse operational regimes. The transformer component is pre-trained on a large corpus of simulated and experimental process trajectories and then fine-tuned to specific tasks using small, high-quality datasets [5]. Our architecture encodes process states and historical data with a transformer encoder, which generates latent embeddings that capture the underlying dynamics. These embeddings are fused with the outputs of a mechanistic model through residual connections and dense layers, allowing the hybrid model to adaptively correct the first-principles predictions while retaining physical consistency. By keeping the physics backbone intact and letting the transformer learn system-specific corrections, the approach effectively handles uncertainty, noise, and unmodeled phenomena, which is especially relevant in large-scale industrial systems.
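The residual-fusion idea above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy Monod-type growth model as the physics backbone, a single-head self-attention layer standing in for the transformer encoder, and a zero-initialized dense correction head, so that before training the hybrid prediction reduces exactly to the mechanistic step.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (T, d) window of past process states; single-head scaled dot-product attention
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = softmax(Q @ K.T / np.sqrt(K.shape[1]))
    return scores @ V  # context-rich embeddings, shape (T, d)

def mechanistic_step(x, mu_max=0.4, Ks=0.5, dt=0.1):
    # toy Monod kinetics (hypothetical stand-in for the FPM):
    # dX/dt = mu_max * S / (Ks + S) * X, substrate consumed 1:1
    X_bio, S = x
    mu = mu_max * S / (Ks + S)
    return np.array([X_bio + mu * X_bio * dt, S - mu * X_bio * dt])

def hybrid_predict(history, params):
    Wq, Wk, Wv, W_out = params
    emb = self_attention(history, Wq, Wk, Wv)[-1]  # embedding of the latest step
    correction = emb @ W_out                        # dense head -> state correction
    fpm = mechanistic_step(history[-1])             # physics backbone prediction
    return fpm + correction                         # residual fusion

d = 2  # state: [biomass, substrate]
params = [rng.normal(scale=0.1, size=(d, d)) for _ in range(3)] + [np.zeros((d, d))]
history = np.array([[1.0, 5.0], [1.05, 4.9], [1.1, 4.8]])

# With W_out zero-initialized, the hybrid output equals the pure mechanistic step;
# training W_out would let the data-driven part learn unmodeled corrections.
print(hybrid_predict(history, params))
```

In a full implementation the correction head would be trained jointly with (or after) pre-training the encoder, so the learned residual absorbs only what the physics misses.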

The proposed framework is broadly applicable across chemical processes involving multiscale, nonlinear dynamics, such as crystallization, polymerization, reactor design, and separation systems [6]. As a demonstration of this capability, we apply the method to a case study in industrial fermentation, where transient oxygen gradients and microorganism activity pose a significant modeling challenge. Here, the transformer-based hybrid model outperformed both purely mechanistic and traditional neural hybrid models in predicting biomass growth, substrate uptake, and product formation under fed-batch operations. Notably, the model was able to generalize across multiple operational modes and time horizons, indicating strong potential for deployment in real-time optimization and control scenarios. This transformer-integrated hybrid framework represents a significant evolution in process modeling strategies, marrying interpretability with flexibility, and enabling chemical engineers to make smarter, data-informed decisions across a wide array of complex processes.

References:

  1. Shah, P., Pahari, S., Bhavsar, R., & Kwon, J. S. I. (2024). Hybrid modeling of first-principles and machine learning: A step-by-step tutorial review for practical implementation. Computers & Chemical Engineering, 108926.
  2. Daoutidis, P., Lee, J. H., Rangarajan, S., Chiang, L., Gopaluni, B., Schweidtmann, A. M., ... & Georgakis, C. (2024). Machine learning in process systems engineering: Challenges and opportunities. Computers & Chemical Engineering, 181, 108523.
  3. Sitapure, N., & Kwon, J. S. I. (2023). Exploring the potential of time-series transformers for process modeling and control in chemical systems: an inevitable paradigm shift? Chemical Engineering Research and Design, 194, 461-477.
  4. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
  5. Haghighatlari, M., et al. (2020). Learning to make chemical predictions: The interplay of feature representation, data, and machine learning methods. Chem, 6(7).
  6. Sitapure, N., & Kwon, J. S. I. (2023). Introducing hybrid modeling with time-series-transformers: A comparative study of series and parallel approach in batch crystallization. Industrial & Engineering Chemistry Research, 62(49), 21278-21291.