2024 AIChE Annual Meeting

Machine Learning Driven Quantitation of Viable Capsids for Gene Therapy Applications

Viral vectors are a promising treatment option for a range of genetic disorders and diseases because they can deliver therapeutic genes to specific cells. However, large-scale production remains challenging, as viral vectors must be not only efficient but also safe to administer to patients. Currently, transmission electron microscopy (TEM) is the standard for high-fidelity quality control, allowing trained technicians to directly visualize and count viral capsids. Manually counting viable capsids is both time-consuming and prone to human error, a problem that becomes more pressing as production of potentially life-saving therapies scales up.

To address this bottleneck, this study explores the application of machine learning algorithms to automate the quantitation of viable capsids in gene therapy applications. The first step was to preprocess the data by cutting the original TEM images into patches, each containing a single capsid. Using the Computer Vision Annotation Tool (CVAT), every capsid was labeled as "full," "partial," or "empty," a simplified classification of capsid fill level. A balanced dataset of these annotated patches was then assembled and analyzed, and the features extracted from it were used to train and compare several machine learning algorithms. During feature engineering, certain features yielded promising results, particularly one that compares the symmetry of image patches, although further investigation is required to confirm its significance. Ultimately, this study aims to reduce the time and error rate associated with TEM-based manual quantitation of capsids, thereby streamlining the production of gene therapies.
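The abstract does not specify how the symmetry feature is computed. As an illustration only, the sketch below shows one plausible way to score the symmetry of a single-capsid patch: compare the patch with its left-right and top-bottom mirror images. The function name `symmetry_score`, the choice of mirror axes, and the mean-absolute-difference measure are all assumptions made for this example, not the study's method.

```python
import numpy as np


def symmetry_score(patch: np.ndarray) -> float:
    """Hypothetical symmetry feature for a 2-D grayscale patch.

    Returns a value in [0, 1]; 1.0 means the patch is identical to both
    its left-right and top-bottom mirror images.
    """
    p = patch.astype(np.float64)
    span = p.max() - p.min()
    if span == 0:
        return 1.0  # a constant patch is trivially symmetric
    p = (p - p.min()) / span  # rescale intensities to [0, 1]

    # Mean absolute difference between the patch and each mirror image.
    diff_lr = np.abs(p - np.fliplr(p)).mean()
    diff_ud = np.abs(p - np.flipud(p)).mean()

    # Average the two differences and invert so that higher = more symmetric.
    return float(1.0 - 0.5 * (diff_lr + diff_ud))


if __name__ == "__main__":
    # Synthetic stand-ins for capsid patches (not real TEM data).
    y, x = np.ogrid[-8:9, -8:9]
    r = np.sqrt(x**2 + y**2)
    ring = (np.abs(r - 5) < 1.5).astype(float)   # centered ring
    ramp = np.tile(np.arange(17.0), (17, 1))     # left-to-right intensity ramp

    print("ring:", symmetry_score(ring))
    print("ramp:", symmetry_score(ramp))
```

A score like this could be appended to the feature vector of each patch before training a classifier on the "full," "partial," and "empty" labels. Whether mirror symmetry, rotational symmetry, or another measure best separates the classes is an open question that the abstract leaves for further investigation.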