2024 AIChE Annual Meeting
(372ab) Development of a Model Based on Deep Reinforcement Learning, LSTM Network, and TD3 Algorithm for Optimization and Control of MAb Production in Mammalian Cell Cultures
Efficient monoclonal antibody (MAb) production in mammalian cell cultures requires highly specialized culture media, because variations in culture conditions cause substantial cell death and reduce MAb productivity in vitro. Although some production practices have been in use for decades, cell kinetics is still being investigated in order to develop production strategies that are cost-effective both quantitatively and qualitatively. Developing these strategies requires an understanding of how process dynamics affect cell metabolism in the relevant culture environments. Kinetic models describe cell growth and metabolic activity quantitatively, which allows prediction of different cell phenotypes and provides a better understanding of cell physiology; both are important for optimizing MAb production in animal cell cultures. However, the complexity of cell metabolism and process dynamics makes kinetic descriptions of mammalian cell cultures difficult, and development of first principles models (FPMs) is tedious. While FPMs provide a qualitative understanding of cell culture processes, their rigid structure limits their application to prediction, optimization, and feedback control in dynamic operations.
A recent focus has been on statistical methods for modeling mammalian cell cultures, but this approach has had limited success because the resulting models are restricted to particular strains. Cell culture processes are complex and dynamic, which makes it difficult to manage and interpret the large amount of data they generate and highlights the need for advanced predictive models and control. Optimizing mammalian cell cultures, particularly CHO cells, is crucial for efficient MAb production. Data-driven models have shown promise in understanding and controlling these complex biological processes, and they are well suited to the design, optimization, and scale-up of bioreactors used for MAb production and to model predictive control of those bioreactors.
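As a rough illustration of what a kinetic description of a culture looks like, the sketch below integrates a toy batch model with Monod-type growth on glucose. It is not the authors' first principles model: the function name `simulate_batch`, the single-substrate structure, the omission of cell death, glutamine, lactate, and ammonia, and every parameter value are assumptions chosen only to keep the example short.

```python
def simulate_batch(x0=0.2, s0=25.0, hours=120.0, dt=0.1,
                   mu_max=0.04, k_s=1.0, y_xs=0.1, q_p=0.5):
    """Explicit-Euler integration of a toy batch-culture model.

    x: viable cell density, s: glucose, p: MAb titer (arbitrary units).
    All parameter values are hypothetical, not fitted to any data.
    """
    x, s, p = x0, s0, 0.0
    trajectory = [(0.0, x, s, p)]
    steps = int(round(hours / dt))
    for i in range(1, steps + 1):
        mu = mu_max * s / (k_s + s)   # Monod specific growth rate
        dx = mu * x                   # cell growth (no death term here)
        ds = -dx / y_xs               # glucose consumed per unit of growth
        dp = q_p * x                  # non-growth-associated MAb formation
        x += dx * dt
        s = max(s + ds * dt, 0.0)     # glucose cannot go negative
        p += dp * dt
        trajectory.append((i * dt, x, s, p))
    return trajectory


if __name__ == "__main__":
    t, x, s, p = simulate_batch()[-1]
    print(f"t={t:.0f} h  cells={x:.2f}  glucose={s:.2f}  MAb={p:.1f}")
```

Trajectories generated this way could, in principle, serve as in-silico training data for a data-driven model, although a realistic model would need additional states and experimentally fitted parameters.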
The optimization of mammalian cell cultures is a crucial step in biopharmaceutical production. However, the nonlinearity and complexity of cell cultures make these processes difficult to model and optimize. In this work, we demonstrate the capabilities of Deep Reinforcement Learning (DRL) to enhance MAb production in mammalian cell cultures, with a particular focus on the dynamic nature of biological processes. By integrating a Long Short-Term Memory (LSTM) network with the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, we tailor our approach to the continuous and complex variables involved in bioprocess control and optimization tasks. LSTM networks are recurrent neural networks that model and analyze sequential data and can handle large, nonlinear experimental databases. The method has improved over the past decade, and the availability of large databases and greater computing power now makes it possible to model and analyze mammalian cell cultures. The LSTM-TD3 model handles the temporal dependencies and nonlinear dynamics characteristic of CHO cell cultures, facilitating precise real-time adjustments of nutrient feeds and environmental conditions to optimize MAb production. Our approach uses the reinforcement learning environment in conjunction with historical process data and in silico data generated from FPMs. This combination provides a robust predictive framework that is tuned to the specific requirements of bioprocess optimization and offers a more accurate and comprehensive simulation of biological processes. The integration of historical process data and FPM-generated in silico data with the LSTM-TD3 model represents a significant advance in bioprocess control, promising to improve efficiency and product quality in MAb manufacturing. The results of detailed simulations underscore the potential of the LSTM-TD3 model to transform bioprocessing strategies.
This approach not only demonstrates the model’s effectiveness in handling complex bioprocessing tasks but also sets a new standard for the application of DRL in the biopharmaceutical industry, promising substantial improvements in the scalability and sustainability of MAb production. The model simulations reproduce with high accuracy the trends observed in batch and fed-batch mammalian cell cultures for the key nutrients glucose and glutamine, viable cell density, target product (MAb) titer, and the inhibitory metabolites lactate and ammonia. The LSTM-TD3 model predicts the trajectories of these process variables reliably.
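For readers unfamiliar with TD3, the sketch below shows the critic target that the algorithm is generally described as using: Gaussian noise is clipped and added to the target policy's action (target policy smoothing), and the smaller of two target critic estimates is bootstrapped (clipped double-Q learning). It is a minimal scalar-action illustration, not the authors' implementation. The function name `td3_target`, the callable arguments, and all default hyperparameters are assumptions. In an LSTM-TD3 setup the state passed in would presumably be an LSTM encoding of the recent process history, which is not shown here.

```python
import random


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Return the TD3 critic target for one transition (scalar action).

    target_actor(state) -> action and target_q*(state, action) -> value are
    hypothetical callables standing in for the target networks.
    """
    # Target policy smoothing: clipped Gaussian noise on the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = target_actor(next_state) + noise
    next_action = max(action_low, min(action_high, next_action))

    # Clipped double-Q: bootstrap from the smaller of the two estimates.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))

    # No bootstrapping past a terminal state (e.g. end of a batch run).
    return reward + gamma * (1.0 - float(done)) * q_min


if __name__ == "__main__":
    y = td3_target(1.0, None, False,
                   target_actor=lambda s: 0.0,
                   target_q1=lambda s, a: 1.0,
                   target_q2=lambda s, a: 2.0)
    print(y)
```

The "delayed" part of TD3, updating the actor and target networks less often than the critics, belongs to the training loop and is omitted here.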