2024 AIChE Annual Meeting

(350f) Data Science and Machine Learning to Step into the Digital Era of Organic Solvent Nanofiltration

Authors

Ignacz, G. - Presenter, King Abdullah University of Science and Technology
Yang, C., King Abdullah University of Science and Technology
Alqadhi, N., King Abdullah University of Science and Technology
Szekely, G., The University of Manchester

Organic solvent nanofiltration (OSN) is emerging as a promising membrane technology, offering potential energy and operational cost reductions, particularly in the pharmaceutical and fine chemical industries, for solvent recovery, downstream purification, impurity removal and catalyst recovery. Despite its advantages, the industrial adoption of OSN faces challenges due to limited datasets and the absence of predictive models at both membrane and process levels. Addressing these challenges requires systematic data aggregation and curation techniques that facilitate the development of machine learning models and predictive capabilities [1]. To this end, in 2021 we established the OSN Database, an open-access platform for sharing data and models. Since then, the OSN Database has undergone significant development [2]. With over 7000 data points spanning 20 solvents and solvent mixtures, the database serves as a valuable resource for OSN-related data and for the advancement of data-driven predictive tools. The OSN Database is freely available at www.osndatabase.com, with each data point linked directly to the original publication, ensuring transparency and accessibility. Currently, the OSN Database hosts the largest and most chemically diverse dataset related to nanofiltration.

Recent progress in numerical optimization models has been driven by the availability of large datasets and rapid advances in computational power, leading to major improvements in process and material design. These developments have enabled the optimization of high-dimensional spaces through machine learning and deep learning where analytical solutions are not feasible. In the field of OSN in particular, we have demonstrated that rational and diversified data aggregation and curation can markedly improve process parameter prediction, even in specialized cases [3,4]. For the first time, we directly use molecular structural information on the solutes, solvents, and membranes in downstream machine learning tasks for rejection prediction. We explore the correlations between rejection and various molecular parameters, revealing the substantial impact of molecular structure, electronic effects and solvent permeance. As of today, our models have the lowest root mean squared error (0.124 and 0.123 for the internal and literature test sets, respectively) and the highest fitting scores for predicting solute rejection in OSN [5].
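For readers unfamiliar with the metric, the root mean squared error quoted above is the square root of the mean squared difference between measured and predicted rejection. The following is a minimal sketch of that calculation only, not the authors' evaluation code. The function name `rmse` and the sample values are hypothetical, and it assumes rejection is expressed as a fraction between 0 and 1, which the abstract does not state.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared residuals, then the square root.
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical rejection values as fractions (assumption; not data from the study).
measured_rejection = [0.95, 0.40, 0.72, 0.10]
predicted_rejection = [0.90, 0.55, 0.70, 0.20]

print(f"RMSE = {rmse(measured_rejection, predicted_rejection):.3f}")
```

A lower value means the predicted rejections lie closer to the measured ones on average. Because the residuals are squared, large individual errors weigh more heavily than small ones.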

We will present case studies showing that, with careful design, explainable machine learning models can help to develop better separation processes. The predictive models we have developed are also freely accessible on the OSN Database, fostering a collaborative and inclusive approach to advancing separations in the field of chemical engineering.

[1] Hu, J.; Kim, C.; Halasz, P.; Kim, J. F.; Kim, J.; Szekely, G. Journal of Membrane Science, 2021, 619, 118513.

[2] Ignacz, G.; Yang, C.; Szekely, G. Journal of Membrane Science, 2022, 641, 119929.

[3] Ignacz, G.; Szekely, G. Journal of Membrane Science, 2022, 646, 120268.

[4] Ignacz, G.; Alqadhi, N.; Szekely, G. Advanced Membranes, 2023, 3, 100061.

[5] Ignacz, G.; Beke, A. K.; Szekely, G. Journal of Membrane Science, 2023, 674, 121519.