2025 AIChE Annual Meeting

(202c) AI-Powered Colorimetric Sensing: Predicting Analyte-Dye Interactions with Machine Intelligence and Large Language Models

Authors

Sina Jamalzadegan - Presenter, North Carolina State University
Akhil Penumudy, North Carolina State University (NCSU)
Belinda Mativenga, North Carolina State University (NCSU)
Md Halim Mondol, North Carolina State University (NCSU)
Edgar Lobaton, North Carolina State University (NCSU)
Nelson Vinueza Benitez, North Carolina State University (NCSU)
Coby Schal, North Carolina State University (NCSU)
Qingshan Wei, North Carolina State University
Rapid, cost-effective analyte detection is essential for applications across pharmaceuticals, agriculture, environmental monitoring, and food safety. Colorimetric sensor arrays offer a powerful solution: they classify analytes based on the collective response of the array rather than on specific molecular recognition. However, developing colorimetric sensor arrays remains an empirical and time-intensive process. To address this bottleneck, we harnessed AI and molecular informatics to create a high-quality, open-source database comprising 12,000+ curated data points from 200 published research articles. Our automated large language model pipeline extracts key sensing parameters (dyes, analytes, and solvents) and translates them into molecular fingerprints and latent-space embeddings via variational autoencoders (VAEs). Leveraging this database, we trained machine learning models to accurately predict analyte-dye interactions, and the best-performing model was selected for experimental validation. The results underscore AI’s transformative role in advancing intelligent colorimetric sensing, paving the way for rapid sensor dye screening and automated sensor array generation.
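
As a rough illustration of the featurization step mentioned in the abstract, the sketch below turns a dye and an analyte, each given as a SMILES string, into fixed-length bit vectors and concatenates them into one input row for a downstream classifier. It is not the authors' pipeline. The hashed character n-gram fingerprint is a deliberately simple stand-in for the chemistry-aware fingerprints a cheminformatics toolkit would provide, and the function names, the 256-bit length, and the two example molecules (phenol and ethanol) are all assumptions chosen for the example.

```python
import hashlib


def ngram_fingerprint(smiles: str, n_bits: int = 256, max_n: int = 3) -> frozenset:
    """Hash every SMILES substring of length 1..max_n to a bit index.

    Hypothetical stand-in for a real molecular fingerprint: it works on
    characters of the SMILES text, not on the molecular graph.
    """
    bits = set()
    for n in range(1, max_n + 1):
        for i in range(len(smiles) - n + 1):
            digest = hashlib.sha256(smiles[i:i + n].encode("utf-8")).digest()
            bits.add(int.from_bytes(digest[:4], "big") % n_bits)
    return frozenset(bits)


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pair_features(dye: str, analyte: str, n_bits: int = 256) -> list:
    """Concatenate the dye and analyte bit vectors into one 0/1 feature row."""
    dye_fp = ngram_fingerprint(dye, n_bits)
    analyte_fp = ngram_fingerprint(analyte, n_bits)
    dye_bits = [1 if i in dye_fp else 0 for i in range(n_bits)]
    analyte_bits = [1 if i in analyte_fp else 0 for i in range(n_bits)]
    return dye_bits + analyte_bits


# Illustrative molecules only (assumed, not taken from the database).
phenol = "c1ccccc1O"
ethanol = "CCO"

row = pair_features(phenol, ethanol)
similarity = tanimoto(ngram_fingerprint(phenol), ngram_fingerprint(ethanol))
```

Rows built this way, paired with a label recording whether a response was reported for that dye-analyte combination, could serve as training input for a standard classifier. A learned embedding such as a VAE latent vector would take the place of, or sit alongside, the bit vectors.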