Machine learning (ML) potentials, particularly graph neural networks (GNNs), have emerged as efficient alternatives to density functional theory (DFT) calculations for catalyst screening. Recent efforts to build large DFT datasets have led to many new GNN architectures. However, generalization to specialized tasks remains poor, because generating new data is costly and often less effective than other screening approaches. In this regard, transfer learning (TL), i.e., fine-tuning the weights of a pre-trained model on new samples or tasks, has proven successful at leveraging small datasets and has achieved outstanding results in fields such as computer vision. In this work, TL is used to accelerate the screening of single-atom alloys (SAAs) with a small dataset (fewer than 1,000 data points). SAAs, composed of isolated promoter atoms dispersed on the surface of a selective host metal, were selected given their relevance to multiple catalytic processes, including selective hydrogenation, oxidation, and electrocatalysis [1].
Models for the direct prediction of adsorption energies from initial geometry configurations, pre-trained on the Open Catalyst 2020 (OC20) dataset [2] (over 600k DFT relaxations for training), were chosen for TL. The fine-tuning data cover several combinations of coinage metal hosts (Cu, Au, Ag), transition metal promoters, and adsorbates. In all cases, the optB86b-vdW exchange-correlation functional was used, which accounts for dispersion forces not considered in the original dataset. Varying extents of fine-tuning (i.e., freezing different blocks during training) are investigated, achieving mean absolute errors (MAE) below 0.2 eV on adsorption energy predictions, in line with state-of-the-art models. This work highlights the effectiveness of TL in extending foundational catalysis models to small datasets, which are often accessible in computational chemistry repositories, thereby broadening the use of ML in high-throughput screening.
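The idea of freezing different blocks during fine-tuning, and of scoring the result by MAE, can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the parameter names (`embedding`, `interaction_0`, `interaction_1`, `output_head`) and both helper functions are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Sequence


def trainable_mask(param_names: Iterable[str],
                   frozen_blocks: Sequence[str]) -> Dict[str, bool]:
    """Map each parameter name to True (fine-tuned) or False (frozen).

    A parameter is frozen when its name starts with "<block>." for any
    block listed in frozen_blocks.
    """
    prefixes = tuple(f"{block}." for block in frozen_blocks)
    return {name: not name.startswith(prefixes) for name in param_names}


def mean_absolute_error(predicted: Sequence[float],
                        reference: Sequence[float]) -> float:
    """MAE in the units of the inputs (eV for adsorption energies)."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical parameter names for a GNN with an embedding layer,
# two interaction blocks, and an output head (assumed, not from the source).
PARAM_NAMES = [
    "embedding.weight",
    "interaction_0.weight",
    "interaction_1.weight",
    "output_head.weight",
]

# Freeze the early blocks; fine-tune the last interaction block and the head.
mask = trainable_mask(PARAM_NAMES, frozen_blocks=["embedding", "interaction_0"])
```

In a PyTorch model, such a mask could be applied by setting `requires_grad = False` on each frozen parameter returned by `model.named_parameters()`. Varying `frozen_blocks` then gives the different extents of fine-tuning, each of which can be compared through `mean_absolute_error` on held-out adsorption energies.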
References
[1] El Berch, John N., et al. "Advances in Simulating Dilute Alloy Nanoparticles for Catalysis." Nanoscale 17.4 (2025): 1936-53.
[2] Chanussot, Lowik, et al. "Open Catalyst 2020 (OC20) Dataset and Community Challenges." ACS Catalysis 11.10 (2021): 6059-72.
