2025 AIChE Annual Meeting
Machine Learning Models for Catalytically Relevant Transition Metal Carbide Surfaces
We use low-index (1–2) surfaces as training data for machine-learning (ML) models that predict the energies of higher-index (≥3) facets of transition metal carbides (TMCs). We first generated a DFT dataset of MoC, WC, VC, and NbC slabs spanning multiple terminations. Structural and elemental descriptors, such as the total number of atoms in the surface, the cell area, and the Bader charge, serve as model features. Model hyperparameters are tuned extensively because they strongly influence prediction performance. We studied several models, including Support Vector Regression (SVR), Kernel Ridge Regression (KRR), and Random Forest Regression (RFR). In initial tests, the SVR model yields an RMSE of 0.088 eV/atom, but visible discrepancies between predicted and reference energies remain. To address this, we will generate additional structural and electronic features to better distinguish data points. Model performance will then be assessed systematically using parity plots, energy distribution comparisons, and error metrics. Altogether, our initial results indicate that data-efficient ML models trained on low-index surfaces can extend to complex, higher-index TMC facets with reliable accuracy, providing a predictive framework to accelerate the discovery of target surfaces for sustainable catalysis.
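The train-on-low-index, test-on-high-index protocol and the RMSE metric mentioned above can be illustrated with a minimal sketch. It is not the authors' code. The `Slab` record, its field names, and the helper functions are hypothetical, and every energy and prediction below is a made-up placeholder rather than a result from the study. The sketch also assumes that "index" means the largest absolute Miller index of a facet, which the abstract does not state explicitly.

```python
import math
from dataclasses import dataclass


@dataclass
class Slab:
    """One surface record; all field names are illustrative assumptions."""
    carbide: str       # e.g. "MoC", "WC", "VC", "NbC"
    miller: tuple      # (h, k, l) Miller indices of the facet
    energy_dft: float  # reference energy in eV/atom (placeholder values only)


def max_index(miller):
    """Largest absolute Miller index, used here to label a facet low- or high-index."""
    return max(abs(i) for i in miller)


def split_by_index(slabs, low_max=2):
    """Put facets with largest index <= low_max in the training set, the rest in the test set."""
    train = [s for s in slabs if max_index(s.miller) <= low_max]
    test = [s for s in slabs if max_index(s.miller) > low_max]
    return train, test


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted energies."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted energies."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Placeholder records: the energies are invented for illustration only.
slabs = [
    Slab("MoC", (1, 0, 0), -9.10),
    Slab("MoC", (1, 1, 1), -9.02),
    Slab("WC", (2, 1, 0), -9.85),
    Slab("VC", (1, 1, 0), -8.40),
    Slab("NbC", (3, 1, 1), -8.95),
    Slab("WC", (3, 2, 1), -9.70),
    Slab("MoC", (4, 1, 0), -8.98),
]

train, test = split_by_index(slabs)

# Stand-in predictions for the held-out high-index facets; in practice these
# would come from a regressor (SVR, KRR, RFR, ...) fitted on `train` only.
y_true = [s.energy_dft for s in test]
y_pred = [-9.00, -9.60, -9.05]

print(f"train facets: {len(train)}, test facets: {len(test)}")
print(f"RMSE = {rmse(y_true, y_pred):.3f} eV/atom, MAE = {mae(y_true, y_pred):.3f} eV/atom")
```

Splitting on facet index instead of at random keeps every high-index surface out of training, so the reported error measures extrapolation to unseen facets rather than interpolation among similar ones.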