2025 AIChE Annual Meeting

(207c) Thermodynamic Surrogate Modeling for Open-Source Pharmaceutical Process Simulations

Authors

Salvador Garcia Munoz, Eli Lilly and Company
Alexander Dowling, University of Notre Dame
Computational efficiency is a significant challenge in pharmaceutical process simulations. Accurate thermodynamic models are needed for drug formulation, purification, and separation processes [1,2], but evaluating complex thermodynamic models is computationally expensive, which limits their use in large-scale simulations. To address this challenge, this work presents a hybrid modeling framework that integrates Gaussian process (GP)-based thermodynamic surrogate models into process simulations. Our contributions are twofold: a software integration that links open-source tools to extend process simulation capabilities, and a methodological advance in how the surrogate models are constructed.

Regarding software, we developed a GP surrogate for activity coefficient predictions in a water-ethanol mixture as a case study. The surrogate was embedded in a distillation process modeled in PharmaPy [3], an open-source process simulation tool widely used in pharmaceutical modeling. A key innovation of this work is the link between PharmaPy and Clapeyron.jl [4], a high-performance thermodynamic library implemented in Julia. This connection lets PharmaPy incorporate custom thermodynamic models, improving its flexibility for pharmaceutical process simulations. We demonstrate this capability with SAFT-γ Mie [5], a group-contribution thermodynamic model that combines statistical associating fluid theory (SAFT) [6,7,8] with Mie potentials to describe phase behavior and intermolecular interactions in complex fluid systems accurately. This integration is a stepping stone toward embedding a broader range of thermodynamic models into PharmaPy for pharmaceutical unit operations.
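
To make the surrogate idea concrete, the following is a minimal sketch of GP regression for an activity coefficient in a binary mixture, written in plain NumPy. It is not the implementation used in this work: the expensive SAFT-γ Mie evaluation through Clapeyron.jl is replaced by a two-parameter Margules expression, and the Margules parameters, kernel hyperparameters, and all function names are illustrative assumptions rather than fitted values or PharmaPy interfaces.

```python
import numpy as np

def ln_gamma1_margules(x1, a12=1.6, a21=0.9):
    """Stand-in for the expensive model: two-parameter Margules ln(gamma_1).
    a12 and a21 are illustrative values, not fitted to water-ethanol data."""
    x2 = 1.0 - x1
    return x2**2 * (a12 + 2.0 * (a21 - a12) * x1)

def rbf_kernel(xa, xb, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of compositions."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def fit_gp(x_train, y_train, noise=1e-8):
    """Zero-mean GP with fixed hyperparameters; store the Cholesky factor."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return {"x": x_train, "L": L, "alpha": alpha}

def predict_gp(model, x_new):
    """Posterior mean and variance at new compositions."""
    Ks = rbf_kernel(x_new, model["x"])
    mean = Ks @ model["alpha"]
    v = np.linalg.solve(model["L"], Ks.T)
    var = rbf_kernel(x_new, x_new).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Train on a handful of "expensive" evaluations, then query the cheap surrogate.
x_train = np.linspace(0.0, 1.0, 8)
model = fit_gp(x_train, ln_gamma1_margules(x_train))

x_test = np.linspace(0.0, 1.0, 101)
mean, var = predict_gp(model, x_test)
max_err = np.max(np.abs(mean - ln_gamma1_margules(x_test)))
print(f"max |surrogate - model| on test grid: {max_err:.2e}")
```

Inside a process model, a surrogate of this kind would be called in place of the full equation of state at each evaluation of the unit-operation equations, and the posterior variance indicates where the surrogate is least trustworthy.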

Regarding methodology, we employed maximum entropy model-based design of experiments (MBDOE) [9,10] to construct the GP surrogate model systematically, so that the most informative thermodynamic states were sampled. A key contribution of this work is the derivation of a closed-form information-theoretic acquisition function, enabling efficient Bayesian optimization of hierarchical GP models [11]. This approach improved the surrogate model’s accuracy while reducing the amount of training data required, enhancing its effectiveness in process simulations.
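
As an illustration of maximum entropy sampling, the sketch below implements the standard greedy rule for a GP with fixed hyperparameters: the predictive distribution at a candidate point is Gaussian, its differential entropy is 0.5·ln(2πe·σ²), and so the next sample is the candidate with the largest posterior variance. This is not the closed-form acquisition function for hierarchical GP models derived in this work; the kernel, its hyperparameters, the candidate grid, and the function names are assumptions chosen only for the example.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def posterior_variance(x_design, x_cand, noise=1e-6):
    """GP posterior variance at candidates given the current design.
    For fixed hyperparameters it does not depend on the observed responses."""
    K = rbf_kernel(x_design, x_design) + noise * np.eye(len(x_design))
    Ks = rbf_kernel(x_design, x_cand)
    v = np.linalg.solve(np.linalg.cholesky(K), Ks)
    prior = rbf_kernel(x_cand, x_cand).diagonal()
    return np.maximum(prior - np.sum(v**2, axis=0), 0.0)

def predictive_entropy(var):
    """Differential entropy of a Gaussian with variance var."""
    return 0.5 * np.log(2.0 * np.pi * np.e * var)

def greedy_max_entropy_design(x_cand, x_init, n_add, noise=1e-6):
    """Add n_add points, each maximizing the predictive entropy."""
    design = list(x_init)
    for _ in range(n_add):
        # Add the noise variance so the entropy stays finite at design points.
        var = posterior_variance(np.array(design), x_cand, noise) + noise
        entropy = predictive_entropy(var)
        design.append(x_cand[int(np.argmax(entropy))])
    return np.array(design)

# Candidate compositions on a grid; start from a single initial sample.
candidates = np.linspace(0.0, 1.0, 101)
design = greedy_max_entropy_design(candidates, x_init=[0.5], n_add=4)
print("selected compositions:", np.sort(design))
```

With a stationary kernel the rule spreads samples across the candidate set, since points far from existing samples carry the most predictive uncertainty. A hierarchical treatment that also accounts for uncertainty in the GP hyperparameters changes the acquisition function and is outside the scope of this sketch.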

[1] Salo-Ahen, O. M. H., Alanko, I., Bhadane, R., Bonvin, A. M. J. J., Honorato, R. V., Hossain, S., Juffer, A. H., Kabedev, A., Lahtela-Kakkonen, M., Larsen, A. S., Lescrinier, E., Marimuthu, P., Mirza, M. U., Mustafa, G., Nunes-Alves, A., Pantsar, T., Saadabadi, A., Singaravelu, K., & Vanmeert, M. (2021). Molecular dynamics simulations in drug discovery and pharmaceutical development. Processes, 9(1), 71. https://doi.org/10.3390/pr9010071

[2] Ierapetritou, M. G., & Ramachandran, R. (Eds.). (2016). Process simulation and data modeling in solid oral drug development and manufacture. Humana Press. https://doi.org/10.1007/978-1-4939-2996-2

[3] Casas-Orozco, D., Laky, D., Wang, V., Abdi, M., Feng, X., Wood, E., Laird, C., Reklaitis, G. V., & Nagy, Z. K. (2021). PharmaPy: An object-oriented tool for the development of hybrid pharmaceutical flowsheets. Computers & Chemical Engineering, 153, 107408. https://doi.org/10.1016/j.compchemeng.2021.107408

[4] Walker, P. J., Yew, H.-W., & Riedemann, A. (2022). Clapeyron.jl: An extensible, open-source fluid thermodynamics toolkit. Industrial & Engineering Chemistry Research, 61(20), 7130–7153. https://doi.org/10.1021/acs.iecr.2c00326

[5] Papaioannou, V., Lafitte, T., Avendaño, C., Adjiman, C. S., Jackson, G., Müller, E. A., & Galindo, A. (2014). Group contribution methodology based on the statistical associating fluid theory for heteronuclear molecules formed from Mie segments. Journal of Chemical Physics, 140(5), 054107. https://doi.org/10.1063/1.4851455

[6] Chapman, W. G., Jackson, G., & Gubbins, K. E. (1988). Phase equilibria of associating fluids: Chain molecules with multiple bonding sites. Molecular Physics, 65(5), 1057–1079. https://doi.org/10.1080/00268978800101601

[7] Chapman, W. G., Gubbins, K. E., Jackson, G., & Radosz, M. (1989). SAFT: Equation-of-state solution model for associating fluids. Fluid Phase Equilibria, 52, 31–38. https://doi.org/10.1016/0378-3812(89)80308-5

[8] Chapman, W. G., Gubbins, K. E., Jackson, G., & Radosz, M. (1990). New reference equation of state for associating liquids. Industrial & Engineering Chemistry Research, 29(8), 1709–1721. https://doi.org/10.1021/ie00104a021

[9] Shewry, M. C., & Wynn, H. P. (1987). Maximum entropy sampling. Journal of Applied Statistics, 14(2), 165–170. https://doi.org/10.1080/02664768700000020

[10] Currin, C., Mitchell, T., Morris, M., & Ylvisaker, D. (1991). Bayesian prediction of deterministic functions, with applications to the design and analysis of computer experiments. Journal of the American Statistical Association, 86(416), 953–963. https://doi.org/10.2307/2290511

[11] Lalchand, V., & Rasmussen, C. E. (2019). Approximate inference for fully Bayesian Gaussian process regression. Proceedings of Machine Learning Research, 118, 1–12. https://doi.org/10.48550/arXiv.1912.13440