2025 AIChE Annual Meeting

(633b) Predicting Solubility Curves in Solvent Mixtures Using Thermodynamic Cycles and Machine Learning

Authors

Simona Buzzi, KU Leuven
Florence Vermeire, Massachusetts Institute of Technology
William Green, Massachusetts Institute of Technology
Predicting the solubility limits of organic compounds in solvent mixtures is crucial in many areas, including environmental science, chemical engineering, pharmaceuticals, and biomedical research. Understanding how solutes interact with different solvent combinations at different temperatures makes it possible to tailor solvents to the varied needs of the drug development process. Advances in machine learning have shown great potential for solvent screening, offering an alternative to costly and time-consuming experimental solubility measurements. However, existing machine learning models remain focused on single-solvent systems. In this work, we extend two machine learning methodologies, which use the sublimation and fusion cycles, to predict solubility in mixed solvents. We use the MolPool function to mix solvents in a permutation-invariant manner. Models that predict solvation energy/enthalpy or activity coefficients are then incorporated into the SolProp or Fusion Cycle methodologies, respectively. The methods are tested and compared on a dataset containing ~29,000 experimental solubility measurements, yielding RMSEs below 1 log unit. Moreover, the use of multiple experimental reference values for a given solute is discussed and shown to improve predictions.
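
To illustrate what mixing solvents "in a permutation-invariant manner" can mean, the following is a minimal sketch and not the authors' MolPool implementation, whose exact form is not given in this abstract. It assumes each solvent is represented by a fixed-length embedding vector and that the mixture is pooled as a mole-fraction-weighted sum. The function name `mole_fraction_pool` and the toy embeddings are hypothetical.

```python
from typing import List, Sequence


def mole_fraction_pool(
    embeddings: Sequence[Sequence[float]],
    mole_fractions: Sequence[float],
) -> List[float]:
    """Pool solvent embeddings into one mixture vector.

    Assumption (not from the abstract): pooling is a mole-fraction-weighted
    sum. Because addition is commutative, the result does not depend on the
    order in which the solvents are listed.
    """
    if not embeddings:
        raise ValueError("at least one solvent is required")
    if len(embeddings) != len(mole_fractions):
        raise ValueError("one mole fraction is required per solvent")
    total = sum(mole_fractions)
    if total <= 0:
        raise ValueError("mole fractions must sum to a positive value")

    dim = len(embeddings[0])
    pooled = [0.0] * dim
    for emb, x in zip(embeddings, mole_fractions):
        if len(emb) != dim:
            raise ValueError("all embeddings must have the same length")
        weight = x / total  # renormalize so the weights sum to 1
        for i, value in enumerate(emb):
            pooled[i] += weight * value
    return pooled


# Hypothetical 3-dimensional embeddings, for illustration only.
water = [1.0, 0.0, 2.0]
ethanol = [0.0, 4.0, 2.0]

mix_a = mole_fraction_pool([water, ethanol], [0.75, 0.25])
mix_b = mole_fraction_pool([ethanol, water], [0.25, 0.75])
```

Here `mix_a` and `mix_b` describe the same mixture with the solvents listed in opposite order, and the pooled vectors are identical. A learned model would replace the toy vectors with embeddings produced by a molecular encoder.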