2025 AIChE Annual Meeting
(223d) Accelerating Biopharmaceutical Process Development through Automated Data Processing, Analysis, and Hybrid Dynamic Model Assembly
In this work, an automated data processing and analysis workflow is presented, based on the general requirements of a fed-batch CHO cell culture process for the manufacture of therapeutic monoclonal antibodies (mAbs). Outlier identification, missing value imputation, and extended feature engineering are addressed systematically. Coupled with multivariate analysis techniques, this enables enhanced exploration of all quantified state and process variables of the studied cell culture process. Hierarchical clustering and principal component analysis are applied within the automated framework to give deeper insight into the relationships between state and process variables, and to support decisions regarding the structure of the devised dynamic models.
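As an illustration of the outlier identification and missing value imputation steps, the following is a minimal sketch and not the workflow's actual implementation. It assumes a modified z-score based on the median absolute deviation for outlier flagging and linear interpolation in time for imputation; the abstract does not state which methods were used. The function names, the threshold of 3.5, and the glucose values are hypothetical.

```python
from statistics import median


def flag_outliers_mad(values, threshold=3.5):
    """Flag outliers using a modified z-score built on the median absolute
    deviation (MAD). Missing entries (None) are never flagged.
    The threshold of 3.5 is a common rule of thumb, assumed here."""
    observed = [v for v in values if v is not None]
    med = median(observed)
    mad = median(abs(v - med) for v in observed)
    if mad == 0:
        # No spread in the data: nothing can be scored as an outlier.
        return [False] * len(values)
    return [
        v is not None and 0.6745 * abs(v - med) / mad > threshold
        for v in values
    ]


def interpolate_missing(times, values):
    """Fill None entries by linear interpolation between the nearest
    observed neighbours. Leading and trailing gaps are left as None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# Hypothetical glucose profile (g/L) sampled every 24 h, with one missing
# sample and one implausible reading.
times = [0, 24, 48, 72, 96, 120]
glucose = [6.0, 5.2, None, 3.8, 45.0, 2.4]

flags = flag_outliers_mad(glucose)
# Treat flagged outliers as missing, then impute all gaps together.
masked = [None if is_out else v for v, is_out in zip(glucose, flags)]
cleaned = interpolate_missing(times, masked)
print(flags)
print(cleaned)
```

A global MAD score is a crude choice for a trending time series; a rolling window or a model residual would likely be more appropriate in practice.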
To support the development of hybrid dynamic models, which combine machine learning with mechanistic knowledge, recursive feature selection techniques were developed and applied within the data processing framework to identify the most appropriate input parameters for the data-driven and machine learning elements. Kinetic rates governing cell growth and metabolism were defined on a cell-specific basis and estimated directly from experimental data to support model training and validation. The automated workflow was trialled and tested on a series of CHO cell culture datasets generated using a high-throughput AMBR250 cell culture system. The strategy yields a significant reduction in the time and resources required for dynamic model assembly and supports model deployment within a process development context, while providing deeper insight into the generated experimental data.
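The estimation of cell-specific kinetic rates from experimental data can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: it assumes constant culture volume between samples, no feed additions within an interval, and negligible cell death, none of which holds strictly in a fed-batch process. The function names and the data values are hypothetical.

```python
import math


def specific_growth_rate(times, viable_cells):
    """Specific growth rate per sampling interval (1/h), assuming
    exponential growth between consecutive samples."""
    return [
        math.log(viable_cells[i + 1] / viable_cells[i])
        / (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    ]


def specific_rate(times, viable_cells, conc):
    """Cell-specific rate per sampling interval: change in concentration
    divided by the integral of viable cell density (trapezoidal rule).
    Negative values indicate consumption, positive values production."""
    rates = []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        ivcd = 0.5 * (viable_cells[i] + viable_cells[i + 1]) * dt
        rates.append((conc[i + 1] - conc[i]) / ivcd)
    return rates


# Hypothetical data: time (h), viable cell density (1e6 cells/mL),
# glucose concentration (g/L).
times = [0, 24, 48]
viable_cells = [1.0, 2.0, 4.0]
glucose = [6.0, 5.1, 3.3]

mu = specific_growth_rate(times, viable_cells)
q_glc = specific_rate(times, viable_cells, glucose)
print(mu)
print(q_glc)
```

Rates computed this way could serve as training targets for the data-driven part of a hybrid model, with the mechanistic mass balances supplying the rest. A real fed-batch calculation would also need to correct for feed volumes and sampling losses.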
Furthermore, when coupled with an advanced cloud-based knowledge management platform, the systematic data processing framework facilitates efficient data management, ensuring structured storage, retrieval, and integration of cell culture process datasets. Automated strategies for data processing, analysis, and model deployment are well positioned to advance the application of mathematical models during process development, contributing to enhanced efficiency and accelerated development strategies. Ultimately, the presented framework paves the way for the advancement of bioprocess digital twins, which aim to provide a virtual representation of the process, enabling enhanced monitoring, computational optimisation, and informed decision-making throughout development and manufacture.