2025 AIChE Annual Meeting

(229e) Open DAC 2025: An Updated Dataset for Sorbent Discovery in Direct Air Capture

Authors

Logan M. Brabson - Presenter, Georgia Institute of Technology
Anuroop Sriram, Facebook AI Research
Xiaohan Yu, The Dow Chemical Company
Zachary Ulissi, Carnegie Mellon University
Andrew J. Medford, Georgia Institute of Technology
David S. Sholl, Oak Ridge National Laboratory
Computer simulation is a useful tool for exploring the vast chemical space of solid adsorbents relevant for direct air capture (DAC). Many datasets of simulated data for metal-organic frameworks (MOFs) are available, but they vary in size, level of theory, and whether the materials have been experimentally synthesized. This work presents the Open DAC 2025 (ODAC25) dataset, an open-source dataset of density functional theory (DFT) calculations of small-molecule adsorption in MOFs relevant for DAC from humid air. Expanding on the Open DAC 2023 dataset of more than 38 million single-point DFT calculations in experimentally synthesized MOFs, the updated dataset adds N2 and O2 adsorption simulations and ~10^3 new MOFs featuring nucleophilic functionalization of organic linkers and open metal sites. New machine learning force fields (MLFFs) trained on ODAC25 using the Equiformer and GemNet architectures are also presented. ODAC25 is the largest dataset of small-molecule adsorption in MOFs computed with quantum chemical simulations. The dataset can serve as a platform for future studies of chemical motifs beneficial for DAC and as a starting point for training machine learning models to accelerate materials discovery for adsorptive separations in MOFs.