2024 AIChE Annual Meeting
(169c) Prediction of pKa in Different Solvents Via Deep Learning
The difference in a compound's pKa between two solvents can be calculated using solvation models. Combined with aqueous pKa data and an “anchor” acidity value in the target solvent, these calculations yield pKa values in nonaqueous solvents. Previously, we have shown that this method, used with the COSMO-RS solvation model, computes dissociation constants with mean absolute errors (MAEs) below 1 log unit in several solvents, and that it performs well for solutes including small molecules such as amino acids and neurotransmitter derivatives.
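The anchoring idea above can be illustrated with a short sketch. It assumes one common form of the relative scheme, in which the proton's transfer free energy cancels between the target acid and the anchor acid; the abstract does not state the exact equations used, so the function names, the kcal/mol units, and the transfer free energies (water to solvent, as a solvation model such as COSMO-RS might supply them) are all illustrative assumptions.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
R_KCAL_PER_MOL_K = 1.987204e-3


def solute_shift(dg_transfer_acid, dg_transfer_base, temperature_k=298.15):
    """Solute-dependent part of the water-to-solvent pKa shift, in pKa units.

    Inputs are transfer free energies (kcal/mol) of the acid HA and its
    conjugate base A-. The proton term is deliberately left out because it
    cancels once an anchor compound is used.
    """
    rt_ln10 = R_KCAL_PER_MOL_K * temperature_k * math.log(10.0)
    return (dg_transfer_base - dg_transfer_acid) / rt_ln10


def pka_in_solvent(pka_aq, shift, anchor_pka_aq, anchor_pka_solvent, anchor_shift):
    """Estimate a nonaqueous pKa from aqueous data plus one anchor value.

    pKa_S(HA) = pKa_w(HA) + [pKa_S(ref) - pKa_w(ref)] + [shift(HA) - shift(ref)]
    """
    return pka_aq + (anchor_pka_solvent - anchor_pka_aq) + (shift - anchor_shift)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not measured or computed data.
    target = solute_shift(dg_transfer_acid=1.0, dg_transfer_base=9.0)
    anchor = solute_shift(dg_transfer_acid=0.5, dg_transfer_base=10.0)
    print(pka_in_solvent(4.0, target, 5.0, 12.0, anchor))
```

By construction, applying the function to the anchor compound itself returns the anchor's pKa in the target solvent, which is a quick sanity check on any implementation of this kind.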
In this work, we leverage this method to develop a dataset of computed pKa values. We also introduce previously unpublished experimental data, including datasets that have recently been, or are being, critically reviewed by IUPAC. We combine these synthetic (computed) and experimental datasets into a large training corpus, which is used to train a directed message-passing neural network (D-MPNN) model. Finally, we evaluate the model's performance and compare it against that of other nonaqueous pKa models.
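As a rough illustration of the data handling and evaluation steps, the sketch below merges a computed dataset with an experimental one and scores predictions by MAE. The abstract does not describe how the two sources are reconciled, so letting experimental values override computed ones for the same (solute, solvent) pair is an assumption, as are the function names and the placeholder identifiers. The D-MPNN itself is not shown.

```python
def merge_datasets(computed, experimental):
    """Combine two {(solute, solvent): pKa} mappings.

    Assumption of this sketch: where both sources cover the same pair,
    the experimental value replaces the computed one.
    """
    merged = dict(computed)
    merged.update(experimental)
    return merged


def mean_absolute_error(predicted, reference):
    """MAE in pKa units over the (solute, solvent) pairs present in both mappings."""
    shared = predicted.keys() & reference.keys()
    if not shared:
        raise ValueError("no overlapping (solute, solvent) pairs to evaluate")
    return sum(abs(predicted[key] - reference[key]) for key in shared) / len(shared)


if __name__ == "__main__":
    # Placeholder identifiers and values, not real measurements.
    computed = {("MOL_A", "SOLVENT_X"): 12.0, ("MOL_B", "SOLVENT_X"): 18.0}
    experimental = {("MOL_A", "SOLVENT_X"): 12.5}
    training = merge_datasets(computed, experimental)
    print(len(training), training[("MOL_A", "SOLVENT_X")])
```

Keying on the (solute, solvent) pair rather than the solute alone matters here because the same compound has a different pKa in each solvent.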