2025 AIChE Annual Meeting

(362c) Comparative Evaluation of Partial Least Squares and Machine Learning for Monitoring Small Step Changes in API in 50 Wt.% Formulation Using NIR within a Feed Frame

Authors

Carlos Ortega-Zuniga, Rutgers University
James Scicolone, Rutgers University
Fernando Muzzio, Rutgers University

Driven by the Quality-by-Design (QbD) initiative and advances in Industry 4.0, the pharmaceutical industry is shifting from traditional batch manufacturing to advanced, integrated processes. This evolution calls for predictive modeling, process optimization, and real-time quality control. A key QbD goal is ensuring drug content uniformity. Process Analytical Technology (PAT) enables real-time quality assurance by using spectral data to build predictive chemometric models. While Partial Least Squares (PLS) regression is commonly used, its linear nature limits its ability to capture the multidimensional, nonlinear relationships typical of pharmaceutical processes. Machine learning (ML) offers advantages such as the ability to model nonlinear relationships, handle high-dimensional spectral data, and learn patterns from data without predefined assumptions. This project compared two ML techniques, Random Forest regression (RF, an ensemble of decision trees built on random subsets of the data) and Artificial Neural Networks (ANNs, interconnected layers of neurons), against PLS for real-time PAT applications in pharmaceutical manufacturing. A case study examined the monitoring of small step changes (0.5, 1.0, and 3.0 %w/w) in active pharmaceutical ingredient (API) concentration within a high-dose formulation (50 %w/w) using near-infrared spatially resolved spectroscopy (NIR-SRS) in a feed frame. NIR spectral data were preprocessed with a first-derivative transformation to reduce baseline effects, and outliers were removed using Z-score and interquartile range (IQR) methods. ANNs provided higher predictive accuracy than RF models, effectively capturing small step changes and transitions. ANNs also performed comparably to traditional PLS models in capturing process dynamics and smooth transitions across setpoints. This study demonstrates the potential of ML methods as data-driven approaches for process modeling in advanced pharmaceutical manufacturing.
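
The preprocessing described in the abstract (first-derivative transformation followed by Z-score and IQR outlier removal) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the derivative was computed, which quantity the outlier tests were applied to, what thresholds were used, or how the two tests were combined. The sketch assumes a simple finite-difference derivative, a hypothetical per-spectrum score (mean absolute derivative), a Z-score threshold of 3, an IQR multiplier of 1.5, and removal of any spectrum flagged by either test. All function names are illustrative.

```python
import statistics


def first_derivative(spectrum):
    """Finite-difference first derivative along the wavelength axis.

    A constant baseline offset cancels in the differences.
    """
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def zscore_flags(values, threshold=3.0):
    """Flag values whose absolute Z-score exceeds the assumed threshold."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [False] * len(values)
    return [abs(v - mean) / sd > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (k is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def preprocess(spectra):
    """Differentiate each spectrum, then drop spectra flagged by either test."""
    derivs = [first_derivative(s) for s in spectra]
    # Assumed per-spectrum score: mean absolute value of the derivative.
    scores = [statistics.fmean([abs(d) for d in ds]) for ds in derivs]
    flagged = [z or q for z, q in zip(zscore_flags(scores), iqr_flags(scores))]
    return [ds for ds, bad in zip(derivs, flagged) if not bad]
```

In this sketch, `preprocess` takes a list of spectra (each a list of absorbance values) and returns the derivative spectra that pass both outlier tests; these would then serve as inputs to a PLS, RF, or ANN regression model. In practice, a smoothing derivative such as Savitzky-Golay is a common alternative to plain finite differences for noisy NIR data.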