Rational solvent selection guided by machine learning and molecular descriptors in asymmetric catalytic reactions
Yehia Amar,a Artur M. Schweidtmann,b Liwei Cao,a,d Paul Deutsch c and Alexei Lapkin a,d
a Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, CB3 0AS, United Kingdom
b Aachener Verfahrenstechnik – Process Systems Engineering, RWTH Aachen University, Aachen, Germany
c UCB Pharma S.A., Allée de la Recherche 60, 1070 Brussels, Belgium
d Cambridge Centre for Advanced Research and Education in Singapore Ltd, 1 Create Way, CREATE Tower #05-05, 138602, Singapore
Email: aal35@cam.ac.uk
Abstract
Rational solvent selection remains a significant challenge in process development, especially in pharmaceutical applications. To address this, a hybrid mechanistic/machine-learning approach geared towards automated process development workflows was developed and successfully applied to the Rh(CO)2(acac)/Josiphos (R1 = cyclohexyl, R2 = 4-methoxy-3,5-dimethylphenyl) catalyzed asymmetric hydrogenation of a chiral α,β-unsaturated γ-lactam. The mechanistic part of the model is based on molecular descriptors of the physico-chemical properties of solvents, including the reaction-specific descriptors of substrate solubility and hydrogen solubility. A library of 400 solvents, each characterized by 17 molecular descriptors, was used.
The algorithm, which is based on a Gaussian process surrogate model, was trained to learn and optimize for both conversion and diastereomeric excess simultaneously, ultimately identifying better solvents.
In addition to serving as a powerful design-of-experiments methodology, the approach yields a statistical surrogate model that is predictive, with a cross-validation correlation coefficient of 0.83. Furthermore, a solvent-mixing strategy based on the black-box approach was also investigated. These methods open the door for process chemists to use enhanced process development workflows for optimization and discovery.
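As an illustration of what optimizing conversion and diastereomeric excess simultaneously can mean in practice, the sketch below filters candidate solvents by Pareto dominance, one common way to compare candidates on two objectives. The abstract does not state that this is the authors' selection criterion. The solvent labels and numerical values are made up for illustration and are not results from this work.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    solvent: str       # hypothetical label, not a solvent from the study
    conversion: float  # percent, higher is better
    de: float          # diastereomeric excess in percent, higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is at least as good as b on both objectives
    and strictly better on at least one of them."""
    at_least_as_good = a.conversion >= b.conversion and a.de >= b.de
    strictly_better = a.conversion > b.conversion or a.de > b.de
    return at_least_as_good and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up values for illustration only (assumption, not data from the paper)
candidates = [
    Candidate("solvent_A", 95.0, 60.0),
    Candidate("solvent_B", 80.0, 85.0),
    Candidate("solvent_C", 70.0, 70.0),
    Candidate("solvent_D", 95.0, 55.0),
]

front = pareto_front(candidates)
print([c.solvent for c in front])  # prints ['solvent_A', 'solvent_B']
```

Here solvent_C is dominated by solvent_B and solvent_D by solvent_A, so only the first two remain as trade-off candidates. In a surrogate-based workflow, the same comparison could be applied to model predictions instead of measured values.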