2025 AIChE Annual Meeting

(223a) Virtual Clone: Machine Learning for Clone Selection in Cell Line Development

Authors

Elcin Icten Gencer, Amgen Inc.
William Heymann, Forschungszentrum Jülich
Jack Albright, Amgen Inc.
Marzieh Ataei, Amgen Inc.
Gita Pandey, Amgen Inc.
Austin Xiong, Amgen Inc.
Yi Li, Amgen Inc.
Edwige Gros, Amgen Inc.
Fabrice Schlegel, Amgen Inc.

Cell line development (CLD) is a critical yet resource-intensive stage in bioprocess development, traditionally relying on extensive empirical screening to identify high-performing clones. This conventional approach demands significant time and resources, creating bottlenecks in biopharmaceutical development timelines. To address these challenges, we have developed an innovative machine learning (ML)-based virtual clone strategy designed to optimize and expedite the cell line selection process.

Our approach integrates advanced analytics, employing both structured (tabular) and unstructured (image) datasets. Computer vision techniques are utilized to quantify and extract meaningful metrics from microscopic images of cells, enriching the traditional tabular datasets. These combined datasets undergo rigorous preprocessing involving domain-specific knowledge application, feature transformation, and automatic feature selection, significantly reducing feature complexity while preserving critical biological insights.
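The abstract does not specify which feature selection method was used. As a minimal illustrative sketch only, the following assumes a simple two-step filter: drop near-constant features, then drop features that are almost perfectly correlated with one already kept. The feature names, toy values, and thresholds are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(table, min_variance=1e-8, max_abs_corr=0.95):
    """Return the names of features to keep.

    table maps a feature name to a list of values, one value per clone.
    Both thresholds are assumed defaults chosen for illustration.
    """
    kept = []
    for name, values in table.items():
        if pvariance(values) <= min_variance:
            continue  # near-constant column carries no ranking signal
        if any(abs(pearson(values, table[k])) > max_abs_corr for k in kept):
            continue  # redundant with a feature that was already kept
        kept.append(name)
    return kept


# Hypothetical toy data: tabular features plus one image-derived metric.
clones = {
    "titer_day7": [1.0, 2.0, 3.0, 4.0],
    "titer_day7_scaled": [2.0, 4.0, 6.0, 8.0],  # duplicate of titer_day7 up to scale
    "plate_id": [1.0, 1.0, 1.0, 1.0],           # constant across clones
    "colony_roundness": [0.9, 0.4, 0.8, 0.3],   # stands in for an image metric
}
selected = select_features(clones)
print(selected)
```

In this toy case the scaled duplicate and the constant column are removed, while the titer and image-derived columns remain. A production pipeline would more likely combine such filters with model-based selection and domain rules.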

We evaluated multiple ML algorithms and data splitting strategies to determine optimal modeling approaches, resulting in highly predictive models that rival or exceed subject matter expert judgment. Our ML-driven models consistently predict high-performing clones with increased precision and have the potential to significantly reduce timelines, allowing domain experts to refocus their efforts from routine analysis toward innovative experimental strategies.
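The abstract does not list the splitting strategies that were compared. One common contrast, sketched here purely as an assumption, is a random record-level split versus a grouped split that holds out whole groups so that related clones never appear on both sides. The "campaign" grouping key and the record layout are hypothetical.

```python
import random


def random_split(records, test_fraction=0.25, seed=0):
    """Shuffle individual records and hold out a fraction of them."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def grouped_split(records, group_key, test_fraction=0.25, seed=0):
    """Hold out whole groups, so no group contributes to both train and test."""
    rng = random.Random(seed)
    groups = sorted({r[group_key] for r in records})
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return train, test


# Hypothetical records: 12 clones spread evenly over 4 campaigns.
records = [{"clone": f"c{i}", "campaign": f"camp{i % 4}"} for i in range(12)]

rand_train, rand_test = random_split(records)
grp_train, grp_test = grouped_split(records, "campaign")

# With the grouped split, held-out campaigns are unseen during training.
train_campaigns = {r["campaign"] for r in grp_train}
test_campaigns = {r["campaign"] for r in grp_test}
print(sorted(test_campaigns), len(grp_train), len(grp_test))
```

A random split tends to give optimistic scores when clones from the same campaign resemble each other, whereas a grouped split is closer to the situation of predicting clones for a new program. Which of these (if either) matches the authors' evaluation is not stated in the abstract.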

This work represents a transformative shift in CLD by leveraging digital capabilities to automate and enhance decision-making. Ultimately, our virtual clone strategy has the potential to accelerate the selection of optimal clones, reduce resource demands, and set a new standard for efficiency and consistency in biopharmaceutical development. By integrating machine learning deeply into the CLD workflow, we contribute to establishing a sustainable competitive advantage and advancing bioprocess optimization industry-wide.