2025 AIChE Annual Meeting

(393k) DFT and ML Based Property Prediction of Metal Complex Photosensitizers for Photodynamic Therapy

Authors

Yachao Dong - Presenter, Dalian University of Technology
Jingxing Gao, Dalian University of Technology
Ran Wang, Dalian University of Technology
Wen Sun, Dalian University of Technology
Jian Du, Dalian University of Technology
Photodynamic therapy (PDT) is a noninvasive clinical treatment for cancers that uses photosensitizers and light. While most research has focused on organic molecules such as porphyrins as photosensitizers, there is emerging interest in transition metal complexes, which can display intense absorption in the visible region; many also possess high two-photon absorption cross-sections, enabling two-photon excitation with near-infrared (NIR) light. Photosensitizer synthesis and subsequent performance testing are time- and resource-consuming, so pre-synthetic screening of photosensitizer properties is critical. Machine learning (ML), which uses algorithms to learn from data, detect patterns, and make fast and accurate predictions, has gained popularity and proved to be a powerful tool in many areas of science and technology. However, traditional structure descriptors such as SMILES and deep learning models such as graph neural networks are not well suited to transition metal complex photosensitizers, because the available data are too scarce to support high-dimensional SMILES-based descriptors or deep learning models. Thus, it is important to develop low-dimensional machine learning models that are applicable to small datasets of transition metal complex photosensitizers.

In this presentation, a hybrid mechanistic and data-driven model is proposed for the quantitative structure-property relationship (QSPR) of photosensitizers. Important excited-state quantum-chemical descriptors (QCD) are first calculated using density functional theory (DFT), since these QCD can describe the mechanism of the type II PDT process through the differences in electron density among the excited states. These descriptors are combined with three other kinds of descriptors: metal-centered descriptors (MCD), describing the impact of the radius, oxidation state, and outer electron configuration of the metal center; molecular structure descriptors (MSD), describing the impact of molecular size and functional groups; and external condition descriptors (ECD), describing the impact of solvents and excitation wavelengths. Together they are used to build several machine learning models, including LASSO, support vector regression, kernel ridge regression, random forest regression, and XGBoost. These models are tested on predicting the singlet oxygen quantum yield, a particularly important evaluation index of type II photosensitizers for PDT, of hexa-coordinate transition metal complex photosensitizers such as Ru, Ir, and Re complexes.
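To make the descriptor-to-model pipeline concrete, the following is a minimal sketch, not the authors' implementation. It concatenates the four descriptor blocks into one low-dimensional feature matrix and fits a closed-form kernel ridge regression with an RBF kernel. The block sizes, the random placeholder data, the synthetic target, and the hyperparameters `alpha` and `gamma` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40  # small dataset, mimicking the data-scarce setting

# Hypothetical descriptor blocks: random placeholders, NOT real data.
blocks = {
    "MCD": rng.normal(size=(n, 3)),  # e.g. radius, oxidation state, electron configuration
    "QCD": rng.normal(size=(n, 3)),  # excited-state descriptors from DFT
    "MSD": rng.normal(size=(n, 2)),  # e.g. molecular size, functional groups
    "ECD": rng.normal(size=(n, 2)),  # e.g. solvent, excitation wavelength
}
X = np.hstack([blocks[k] for k in ("MCD", "QCD", "MSD", "ECD")])

# Synthetic stand-in for the singlet oxygen quantum yield (assumed form).
y = (0.5 + 0.15 * blocks["QCD"][:, 0] - 0.10 * blocks["MCD"][:, 1]
     + 0.02 * rng.normal(size=n))


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_krr(X, y, alpha=0.1, gamma=0.05):
    """Fit kernel ridge regression on standardized features; return a predictor."""
    mu, sigma = X.mean(0), X.std(0)
    Xs = (X - mu) / sigma
    y_mean = y.mean()
    K = rbf_kernel(Xs, Xs, gamma)
    # Dual coefficients solve (K + alpha * I) c = y - mean(y).
    dual = np.linalg.solve(K + alpha * np.eye(len(y)), y - y_mean)

    def predict(X_new):
        return rbf_kernel((X_new - mu) / sigma, Xs, gamma) @ dual + y_mean

    return predict


predict = fit_krr(X, y)
y_hat = predict(X)
rmse = float(np.sqrt(np.mean((y_hat - y) ** 2)))
print(f"feature matrix: {X.shape}, training RMSE: {rmse:.3f}")
```

With only ten features and a few dozen samples, the model has far fewer degrees of freedom than a graph neural network, which is the motivation for low-dimensional descriptors on small datasets. In practice, library implementations such as scikit-learn's `KernelRidge` and `SVR` would likely replace the hand-written solver.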

A subsequent comparison of different descriptor combinations (MCD+QCD+ECD and MCD+MSD+ECD) as model input confirms the impact on singlet oxygen quantum yield of the QCD, which describe the excited-state properties. The support vector regression and kernel ridge regression models also show good generalization ability on an external test set, while the XGBoost model shows slight overfitting. Finally, we find that the support vector regression and kernel ridge regression models with all four kinds of descriptors are the best of the studied models for property prediction of metal complex photosensitizers. These two low-dimensional machine learning models could be a useful tool to aid experimental research in the pre-synthetic screening of hexa-coordinate photosensitizers. Other transition metal complexes, such as metal porphyrin complexes, may also be included in the ML training set in future research.
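The descriptor-combination comparison and the external-test check described above can be sketched as follows. This is an illustrative assumption, not the authors' protocol: the data are random placeholders, the target is constructed to depend mainly on the QCD block, and plain linear ridge regression is swapped in for brevity (it is not one of the five models named in the abstract). The gap between training and test error gives a rough indication of overfitting.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_test = 60, 20
n = n_train + n_test

# Random placeholder descriptor blocks (NOT real data).
D = {name: rng.normal(size=(n, 2)) for name in ("MCD", "QCD", "MSD", "ECD")}

# Synthetic target: by construction it depends mostly on the QCD block.
y = (0.5 + 0.20 * D["QCD"][:, 0] + 0.10 * D["QCD"][:, 1]
     + 0.05 * D["ECD"][:, 0] + 0.02 * rng.normal(size=n))


def ridge_rmse(names, lam=1.0):
    """Fit linear ridge regression on the chosen blocks; return (train, test) RMSE."""
    X = np.hstack([D[k] for k in names])
    Xtr, Xte = X[:n_train], X[n_train:]
    ytr, yte = y[:n_train], y[n_train:]
    # Standardize using training statistics only, to avoid test-set leakage.
    mu, sd = Xtr.mean(0), Xtr.std(0)
    Xtr, Xte = (Xtr - mu) / sd, (Xte - mu) / sd
    y_mean = ytr.mean()
    w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(X.shape[1]),
                        Xtr.T @ (ytr - y_mean))

    def rmse(Xs, ys):
        return float(np.sqrt(np.mean((Xs @ w + y_mean - ys) ** 2)))

    return rmse(Xtr, ytr), rmse(Xte, yte)


combos = [("MCD", "QCD", "ECD"), ("MCD", "MSD", "ECD"),
          ("MCD", "QCD", "MSD", "ECD")]
results = {combo: ridge_rmse(combo) for combo in combos}

for combo, (train_err, test_err) in results.items():
    gap = test_err - train_err  # a large positive gap suggests overfitting
    print(f"{'+'.join(combo):<18} train={train_err:.3f} "
          f"test={test_err:.3f} gap={gap:+.3f}")
```

On this synthetic target, dropping the QCD block raises the held-out error sharply. That mirrors the kind of evidence the comparison above relies on, but the numbers themselves carry no chemical meaning.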