2023 AIChE Annual Meeting

(663e) Active Learning Workflow for Discovery of Stable Ternary Alloys from Binary Alloy Data

Authors

Wichrowski, N. J., Johns Hopkins University
Evangelou, N., Johns Hopkins University
Ghanekar, P., Purdue University
Deshpande, S., Purdue University
Kevrekidis, I. G., Princeton University
Greeley, J., Purdue University

Graph convolutional networks (GCNs) have been demonstrated to be excellent surrogate models for mapping catalyst structures to properties. However, determining their fidelity outside the training space, where discovery of catalysts with improved properties is likely, remains a challenge. Uncertainty quantification (UQ) techniques, which attach an uncertainty estimate to each model prediction, are increasingly used for this purpose in a workflow commonly termed active learning. In this workflow, model predictions are iteratively improved in a region of interest: new data points are sampled using an acquisition function that balances exploitation of the target property against exploration of input-space regions with high uncertainty.
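
To make the exploitation/exploration trade-off concrete, the following is a minimal sketch of one common acquisition function for minimization problems, the lower confidence bound (LCB). It is a generic stand-in chosen for illustration and is not claimed to be the acquisition function used in this work; the names `lower_confidence_bound`, `select_next`, and `kappa`, as well as the numerical values, are hypothetical.

```python
import numpy as np

def lower_confidence_bound(mean, std, kappa=1.0):
    """LCB score for a minimization target: lower scores are more attractive.

    mean  : predicted property for each candidate (exploitation term)
    std   : predictive uncertainty for each candidate (exploration term)
    kappa : assumed trade-off weight; kappa = 0 is pure exploitation
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    return mean - kappa * std

def select_next(mean, std, kappa=1.0):
    """Return the index of the candidate with the lowest LCB score."""
    return int(np.argmin(lower_confidence_bound(mean, std, kappa)))

# Hypothetical surrogate predictions for four candidates (illustrative values only).
mean = [-0.30, -0.25, -0.10, -0.28]
std = [0.01, 0.02, 0.30, 0.05]

greedy_choice = select_next(mean, std, kappa=0.0)     # ignores uncertainty
exploring_choice = select_next(mean, std, kappa=1.0)  # rewards high uncertainty
```

With `kappa = 0` the candidate with the lowest predicted value is chosen; increasing `kappa` shifts the choice toward candidates whose predictions are least certain.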

In this work, we demonstrate the utility of such an active learning workflow for extrapolating the predictions of a GCN model trained exclusively on binary alloy properties to ternary alloys. Specifically, we develop the Dropout-based Graph Convolutional Network (dGCN), a GCN model with UQ built on the crystal graph convolutional neural network (CGCNN) framework. We generate an initial dataset of Pd-Sn, Pt-Sn, and Pd-Pt binary alloy formation energies evaluated using density functional theory (DFT). Next, we generate 16-atom ternary configurations for each composition in the Pd-Pt-Sn ternary space. We derive a novel acquisition function that balances minimization of the formation free energy against maximization of uncertainty, and use it to identify a composition to sample at every iteration. Formation energies of the configurations in the sampled composition are then evaluated using DFT and added to the training set. We repeat this workflow for multiple iterations and show improved prediction of ternary alloy formation energies.
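
As an illustration of how dropout can supply an uncertainty estimate, the sketch below applies Monte Carlo dropout to a toy one-hidden-layer network evaluated over the 16-atom Pd-Pt-Sn compositions. Several parts are assumptions made for the example and are not taken from this work: the network takes composition fractions rather than crystal graphs, its weights are random and untrained, only compositions containing all three elements are enumerated, and the names `enumerate_compositions`, `mc_dropout_predict`, `p_drop`, and `n_samples` are hypothetical.

```python
import numpy as np

def enumerate_compositions(n_atoms=16):
    """All (n_Pd, n_Pt, n_Sn) with a fixed atom count and every element present."""
    return [(a, b, n_atoms - a - b)
            for a in range(1, n_atoms - 1)
            for b in range(1, n_atoms - a)]

def mc_dropout_predict(x, w_hidden, w_out, p_drop=0.2, n_samples=200, rng=None):
    """Mean and standard deviation over stochastic forward passes.

    Dropout stays active at prediction time, so each pass uses a different
    random mask; the spread of the outputs serves as the uncertainty estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.atleast_2d(np.asarray(x, dtype=float))
    samples = []
    for _ in range(n_samples):
        hidden = np.maximum(x @ w_hidden, 0.0)      # ReLU hidden layer
        keep = rng.random(hidden.shape) >= p_drop   # fresh dropout mask per pass
        hidden = hidden * keep / (1.0 - p_drop)     # inverted-dropout rescaling
        samples.append(hidden @ w_out)
    samples = np.stack(samples)                     # shape: (n_samples, n_points)
    return samples.mean(axis=0), samples.std(axis=0)

rng = np.random.default_rng(0)
compositions = enumerate_compositions(16)
x = np.array(compositions) / 16.0                   # composition fractions as toy features

# Random, untrained weights standing in for a trained surrogate (assumption).
w_hidden = rng.normal(size=(3, 32))
w_out = rng.normal(size=32)

mean, std = mc_dropout_predict(x, w_hidden, w_out, rng=rng)
```

The resulting `mean` and `std` arrays have one entry per candidate composition, which is the form of input an acquisition function needs in order to rank compositions for the next round of DFT calculations.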

We also compare this physics-informed workflow with a data-driven workflow that uses Diffusion Maps, a dimensionality reduction technique, to find the intrinsic dimensionality of the latent space and to perform active learning in the reduced space. We find that both methods lead to similar improvements, although they differ in how they converge.
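
For readers unfamiliar with Diffusion Maps, the following is a minimal NumPy sketch of the standard construction: a Gaussian kernel, density normalization, and an eigendecomposition of the resulting Markov matrix. It is not the implementation used in this work. The function name `diffusion_map`, the kernel scale `epsilon`, and the synthetic test curve are assumptions chosen for illustration; here the data are points on a one-dimensional curve embedded in three dimensions rather than a GCN latent space.

```python
import numpy as np

def diffusion_map(points, epsilon, n_components=2):
    """Return the leading non-trivial eigenvalues and diffusion coordinates."""
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dists / epsilon)            # Gaussian affinity

    # Density normalization (alpha = 1) to reduce the effect of sampling density.
    q = kernel.sum(axis=1)
    kernel = kernel / np.outer(q, q)

    # Symmetric conjugate of the row-stochastic Markov matrix, so eigh applies.
    d = kernel.sum(axis=1)
    sym = kernel / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1]               # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Map back to right eigenvectors of the Markov matrix.
    psi = eigvecs / np.sqrt(d)[:, None]

    # Skip the trivial constant eigenvector (eigenvalue 1).
    lam = eigvals[1:n_components + 1]
    return lam, psi[:, 1:n_components + 1] * lam

# Synthetic data: a helical arc, i.e. a 1D manifold embedded in 3D.
t = np.linspace(0.0, 2.0, 150)
points = np.column_stack([np.cos(t), np.sin(t), 0.5 * t])

eigvals, coords = diffusion_map(points, epsilon=0.05, n_components=2)
```

For this curve, the first diffusion coordinate varies monotonically with the arc parameter `t`, so a single coordinate parametrizes the data. In practice, the decay of the eigenvalue spectrum is commonly inspected, together with checks for coordinates that are merely harmonics of earlier ones, to judge how many coordinates are needed.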