2025 AIChE Annual Meeting

(261e) A Meta-Learning Approach for Few-Shot Bayesian Optimization of Batch Chemical Processes

Authors

Becky Langdon - Presenter, Imperial College London
Jixiang Qing, Imperial College London
Robert M. Lee, BASF SE
Mark van der Wilk, Imperial College London
Calvin Tsay, Imperial College London
Bayesian Optimisation (BayesOpt) is a powerful tool for global optimisation of expensive-to-measure functions and has found increasing application in chemical engineering, including design of experiments and optimal process and material design [1-2]. Traditional BayesOpt typically uses Gaussian Processes (GPs) as surrogate models, both to approximate these complex black-box functions and to determine where to sample them next. However, GPs have limited interpretability and lack the model structure inherent to time-varying chemical processes. Moreover, GPs face challenges in forecasting when training data do not fully span the test space, and they struggle to generalise across tasks where the underlying task parameters change but remain unobserved.
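
As a point of reference for the GP-based loop described above, the following is a minimal sketch of BayesOpt on a scalar input in [0, 1]. It assumes a zero-mean GP with a fixed squared-exponential kernel, an expected-improvement acquisition maximised on a grid, and maximisation of the objective. All function names, the kernel hyperparameters and the jitter value are illustrative assumptions, not the implementation used in this work.

```python
import math


def rbf(a, b, length=0.2, variance=1.0):
    """Squared-exponential kernel on scalar inputs (fixed, assumed hyperparameters)."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_sub(low, b):
    """Solve low @ y = b for lower-triangular low."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def back_sub_transposed(low, y):
    """Solve low.T @ x = y for lower-triangular low."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_gp(xs, ys, noise=1e-6):
    """Factorise the kernel matrix (plus jitter) and precompute K^{-1} y."""
    kernel = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
              for i, a in enumerate(xs)]
    low = cholesky(kernel)
    alpha = back_sub_transposed(low, forward_sub(low, ys))
    return low, alpha


def predict(xs, low, alpha, x_new):
    """Posterior mean and standard deviation of the zero-mean GP at x_new."""
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(low, k_star)
    var = max(rbf(x_new, x_new) - sum(vi * vi for vi in v), 1e-12)
    return mean, math.sqrt(var)


def expected_improvement(mean, std, best):
    """Expected improvement over the incumbent `best` (maximisation)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


def bayes_opt(objective, init_xs, n_iter=10, grid_size=201):
    """Sequentially evaluate the grid point with the highest expected improvement."""
    xs = list(init_xs)
    ys = [objective(x) for x in xs]
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    for _ in range(n_iter):
        low, alpha = fit_gp(xs, ys)
        best = max(ys)
        scores = [expected_improvement(*predict(xs, low, alpha, g), best) for g in grid]
        x_next = grid[scores.index(max(scores))]
        xs.append(x_next)
        ys.append(objective(x_next))
    return xs, ys
```

A call such as `bayes_opt(lambda x: -(x - 0.3) ** 2, [0.0, 0.5, 1.0], n_iter=8)` returns all evaluated inputs and outputs. Note that nothing in this surrogate encodes process dynamics or information from related tasks, which is the limitation discussed above.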

This work considers the optimisation of fed-batch processes, a costly endeavour made more difficult by inherent, expensive-to-measure fluctuations from batch to batch [3]. These fluctuations arise from changes in the underlying stochastic parameters that govern the reaction, such as biological growth rates or saturation constants. Parameter estimation techniques can evaluate these parameter values within a batch, assuming a given ODE structure, but this evaluation can be expensive, and these methods require new data for each batch [4]. On the other hand, recent advancements in ML can mitigate these issues for dynamical systems. Neural processes, or NPs, are a class of neural latent variable models which meta-learn a distribution over functions to generalise across unobserved task information [5]. As with many meta-learning models, their performance on individual tasks can be hindered by underfitting, for which various adaptations have been proposed, e.g., conditional NPs (CNPs) and transformer NPs (TNPs) [6-7]. Additionally, these models are domain-agnostic and do not incorporate structure inherent to the problem. Neural ODE Processes, or NODEPs, introduce and learn a dual encoding, representing initial conditions and dynamic parameters, and evolve the latent space to incorporate the desired ODE structure [8]. More recently, we introduced System-Aware Neural ODE Processes, or SANODEPs, which adapt NODEPs’ context structure and training process to learn the necessary trajectory-aware tasks, namely forecasting and interpolation, alongside a novel acquisition function for BayesOpt [9]. For tasks such as fed-batch processes, SANODEPs can explicitly learn dynamics, including batch-to-batch variations, using their trajectory-aware structure.

In this work, we investigate few-shot BayesOpt on fed-batch penicillin production, comparing SANODEP to traditional (problem-agnostic) GP implementations. A representative model is selected from the literature, with stochastic parameters identified and assigned a uniform prior distribution centred on nominal values [10]. Trajectories spanning this distribution are used to train SANODEP. At test time, BayesOpt is performed on a randomly sampled ‘ground truth’ task, i.e., a specific set of stochastic parameters, comparing the performance of SANODEP and GPs as the surrogate model for few-shot and global optimisation. This approach highlights the ability of SANODEP to learn a broad distribution of ODEs, leading to superior performance in few-shot optimisation. As expected, GPs perform better in global optimisation due to the underfitting issues inherent in meta-learning methods. We further study the impact of the width of the prior and of test data falling outside the prior. This work highlights the need for thoughtful design of the surrogate model and BayesOpt strategy, as well as the potential for meta-learning, to achieve efficient and reliable optimisation in dynamic chemical processes.
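
The task-generation step described above (uniform priors centred on nominal parameter values, with one simulated trajectory per sampled parameter set) can be sketched as follows. This is an illustration under stated assumptions only: the ODE is a simplified Monod-type fed-batch model rather than the penicillin model of [10], and the nominal values, relative prior width, feed rate, initial state and all function names are hypothetical.

```python
import random

# Hypothetical nominal values for a simplified Monod-type fed-batch model
# (an illustrative stand-in, not the penicillin model of [10]).
NOMINAL = {"mu_max": 0.10, "K_s": 0.50}  # units: 1/h and g/L


def sample_task(rng, rel_width=0.2):
    """Draw one task: each parameter is uniform on nominal * [1 - w, 1 + w]."""
    return {
        name: rng.uniform(value * (1 - rel_width), value * (1 + rel_width))
        for name, value in NOMINAL.items()
    }


def rhs(state, params, feed_rate, s_feed=100.0, yield_xs=0.5):
    """Right-hand side for biomass X (g/L), substrate S (g/L) and volume V (L)."""
    x, s, v = state
    mu = params["mu_max"] * s / (params["K_s"] + s)  # Monod growth rate
    dilution = feed_rate / v
    dx = mu * x - dilution * x
    ds = -mu * x / yield_xs + dilution * (s_feed - s)
    dv = feed_rate
    return (dx, ds, dv)


def simulate(params, feed_rate, state0=(1.0, 10.0, 1.0), t_end=50.0, n_steps=500):
    """Integrate with classical fixed-step RK4; returns [(t, (X, S, V)), ...]."""
    dt = t_end / n_steps
    state = tuple(state0)
    trajectory = [(0.0, state)]

    def shifted(base, slope, scale):
        return tuple(b + scale * k for b, k in zip(base, slope))

    for step in range(1, n_steps + 1):
        k1 = rhs(state, params, feed_rate)
        k2 = rhs(shifted(state, k1, dt / 2), params, feed_rate)
        k3 = rhs(shifted(state, k2, dt / 2), params, feed_rate)
        k4 = rhs(shifted(state, k3, dt), params, feed_rate)
        state = tuple(
            y + dt / 6 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        trajectory.append((step * dt, state))
    return trajectory


def make_training_set(n_tasks, feed_rate=0.01, seed=0):
    """Sample tasks from the prior and simulate one trajectory per task."""
    rng = random.Random(seed)
    tasks = [sample_task(rng) for _ in range(n_tasks)]
    return [(task, simulate(task, feed_rate)) for task in tasks]
```

Under these assumptions, `make_training_set(100)` would yield 100 (parameters, trajectory) pairs of the kind a meta-learned surrogate could be trained on, while a single further call to `sample_task` would play the role of the ‘ground truth’ task at test time. Widening `rel_width`, or drawing a test task from outside the training range, mimics the prior-width and out-of-prior studies mentioned above.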

References:

[1] Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., & de Freitas, N. (2016). Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1), 148-175.

[2] Paulson, J. A., & Tsay, C. (2024). Bayesian optimization as a flexible and efficient design framework for sustainable process systems. Current Opinion in Green and Sustainable Chemistry, 100983.

[3] Shokry, A., Vicente, P., Escudero, G., Pérez-Moya, M., Graells, M., & Espuña, A. (2018). Data-driven soft-sensors for online monitoring of batch processes with different initial conditions. Computers & Chemical Engineering, 118, 159-179.

[4] Zavala, V. M., Laird, C. D., & Biegler, L. T. (2008). Interior-point decomposition approaches for parallel solution of large-scale nonlinear parameter estimation problems. Chemical Engineering Science, 63(19), 4834-4845.

[5] Garnelo, M., Schwarz, J., Rosenbaum, D., Viola, F., Rezende, D. J., Eslami, S. M., & Teh, Y. W. (2018). Neural processes. arXiv preprint arXiv:1807.01622.

[6] Garnelo, M., Rosenbaum, D., Maddison, C., Ramalho, T., Saxton, D., Shanahan, M., Teh, Y. W., Rezende, D., & Eslami, S. A. (2018, July). Conditional neural processes. In International Conference on Machine Learning (pp. 1704-1713). PMLR.

[7] Nguyen, T., & Grover, A. (2022). Transformer neural processes: Uncertainty-aware meta learning via sequence modeling. arXiv preprint arXiv:2207.04179.

[8] Norcliffe, A., Bodnar, C., Day, B., Moss, J., & Liò, P. (2021). Neural ODE processes. arXiv preprint arXiv:2103.12413.

[9] Qing, J., Langdon, B. D., Lee, R. M., Shafei, B., van der Wilk, M., Tsay, C., & Misener, R. (2024). System-aware neural ODE processes for few-shot Bayesian optimization. arXiv preprint arXiv:2406.02352.

[10] Bajpai, R. K., & Reuss, M. (1980). A mechanistic model for penicillin production. Journal of Chemical Technology and Biotechnology, 30(1), 332-344.