2024 AIChE Annual Meeting
(711f) A Constraint on Classical Least Squares for Preprocessing Vibrational Spectra Containing Overlapping Non-Target Peaks
NCCLS, the constrained classical least squares preprocessing method presented in this work, maintains the accuracy of existing regression or classification models when non-target species appear in mixture spectra, thereby reducing the need for calibration sets that include non-target species. Such calibration sets are costly to build: with a common experimental design such as a full factorial, the number of calibration experiments grows exponentially with the number of chemical constituents, so covering every possible non-target species is both resource-intensive and time-consuming. Moreover, some non-target species may not be known before a system or plant is operational. Removing non-target peaks from mixture spectra can therefore reduce the economic burden of using spectroscopic equipment while improving its robustness. Because NCCLS acts only as a preprocessing step that removes non-target contributions from mixture spectra, an established quantification pipeline needs no modification, and NCCLS can operate in conjunction with existing regression or classification models.
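To make the exponential growth concrete, the following minimal Python sketch counts the runs in a full-factorial calibration design. The three concentration levels and the helper name `full_factorial_design` are hypothetical, chosen only for illustration; they are not taken from this work.

```python
from itertools import product


def full_factorial_design(levels, n_species):
    """Enumerate every combination of concentration levels for n_species constituents."""
    return list(product(levels, repeat=n_species))


# Hypothetical concentration levels (arbitrary units), assumed for illustration only.
levels = [0.0, 0.5, 1.0]

run_counts = {}
for n_species in range(2, 7):
    design = full_factorial_design(levels, n_species)
    run_counts[n_species] = len(design)  # equals len(levels) ** n_species
    print(f"{n_species} species -> {run_counts[n_species]} calibration mixtures")
```

Each additional species multiplies the number of calibration mixtures by the number of levels. With three levels, a design for four species needs 81 mixtures, while adding two non-target species to the same design raises the count to 729.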
NCCLS assumes that reference spectra of the target species are available, that non-target species have nonnegative spectral contributions, and that measurement error is independent and identically distributed across training data and process data. The constraint is physically motivated: any species that obeys the Beer-Lambert law, or a similar linearity assumption, cannot contribute negatively to a spectrum, and this holds for non-target species as well as target species. Applying the constraint to the least-squares optimization yields a quadratic programming problem, which is solved using IBM’s CPLEX solver in the Pyomo optimization environment.
The work presented here applies NCCLS to both a computational study of overlapping spectral peaks and a physical multicomponent system of aqueous sodium salts typical of nuclear waste at the Hanford Site in Washington State. Feed stream compositions in the Hanford process are expected to be inconsistent, which creates a need for real-time process monitoring. Two in-line sensors are investigated for application at the Hanford Site: Raman spectroscopy and attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy. Both are vibrational spectroscopies that can benefit from NCCLS preprocessing of spectra from complex mixtures. ATR-FTIR derives its linearity from the Beer-Lambert law, while Raman scattering is analogously linear under certain conditions [1].
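The effect of a nonnegativity constraint on classical least squares can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the constraint takes the form "mixture minus fitted target contribution is nonnegative at every spectral channel", which is one reading of the assumptions stated above, and it treats a single target species, for which the quadratic program has a closed-form solution and no solver is needed. The triangular peak shapes, the peak positions, and all function names are hypothetical.

```python
def triangle_peak(center, half_width, grid):
    """Unit-height triangular band with compact support (a stand-in for a real peak)."""
    return [max(0.0, 1.0 - abs(v - center) / half_width) for v in grid]


def cls_coefficient(mixture, target):
    """Ordinary CLS for one target: argmin_c sum_i (x_i - c * s_i)^2."""
    return sum(x * s for x, s in zip(mixture, target)) / sum(s * s for s in target)


def constrained_coefficient(mixture, target):
    """Same objective, but with x_i - c * s_i >= 0 at every channel and c >= 0.

    Assumed formulation, not necessarily the authors' exact one. With a single
    target the feasible set is the interval [0, min_i x_i / s_i], so the
    quadratic program is solved by clipping the CLS estimate to that interval.
    Assumes a nonnegative mixture spectrum.
    """
    upper = min(x / s for x, s in zip(mixture, target) if s > 0)
    return max(0.0, min(cls_coefficient(mixture, target), upper))


grid = range(101)                          # hypothetical spectral axis
target = triangle_peak(40, 10, grid)       # reference spectrum of the target
non_target = triangle_peak(48, 10, grid)   # overlapping band, unknown to the model
true_c = 0.8
mixture = [true_c * s + 0.5 * n for s, n in zip(target, non_target)]

c_cls = cls_coefficient(mixture, target)
c_con = constrained_coefficient(mixture, target)

# Target-only spectrum that could be passed to an existing model, and the
# residual attributed to non-target species.
cleaned = [c_con * s for s in target]
residual = [x - y for x, y in zip(mixture, cleaned)]

print(f"true {true_c:.3f}  CLS {c_cls:.3f}  constrained {c_con:.3f}")
```

In this noise-free example the unconstrained estimate absorbs part of the overlapping non-target band and overshoots the true coefficient of 0.8 (about 1.01), whereas the constrained estimate is held at 0.8 by the channels where only the target absorbs. With several target species the same constraints define a quadratic program with no such shortcut, which is where a solver such as the CPLEX and Pyomo setup described above would be used.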
Existing methods can also preprocess spectra by removing non-target species. To evaluate the performance of NCCLS, it is compared against several alternative preprocessing methods: principal component analysis, a convolutional denoising autoencoder, two blind source separation methods, and spectral residual augmented classical least squares [2]. A key finding of this study is that NCCLS provides preprocessing performance comparable or superior to every alternative method examined. Most notably, NCCLS outperforms the alternative methods in real-time monitoring scenarios that may be typical of industrial monitoring tasks.
NCCLS has the potential to enable robust real-time monitoring of multicomponent processes where the system composition may change with time, as is shown in this work for sodium salt solutions typical of the Hanford Site. In particular, NCCLS removes non-target contributions effectively even when target and non-target peaks overlap in the spectrum. Alternative methods exist for spectral preprocessing but lack effectiveness in real-time monitoring scenarios. NCCLS may be a tool that engineers and chemists use to reduce the size (and cost) of calibration data sets and to maintain monitoring accuracy in the face of changing process conditions or environmental factors.
Figure 1: Parity plots showing the prediction accuracy from in silico spectra undergoing various preprocessing techniques (including the method of this work, NCCLS) in a real-time monitoring scenario (attached image).
References
[1] R. L. McCreery, Raman Spectroscopy for Chemical Analysis, vol. 157. 2000.
[2] D. M. Haaland and D. K. Melgaard, “New augmented classical least squares methods for improved quantitative spectral analyses,” Vib. Spectrosc., vol. 29, no. 1–2, pp. 171–175, 2002, doi: 10.1016/S0924-2031(01)00199-0.