2025 AIChE Annual Meeting

(573d) Adaptive allocation of Monte Carlo samples for efficient, multi-fidelity computational screening of metal-organic frameworks

Authors

N. Scott Bobbitt, Northwestern University
Jana Doppa, Washington State University
Huazheng Wang, Oregon State University
Cory Simon, Oregon State University
For applications in gas sensing, purification, and capture, we often wish to search a large set of metal-organic frameworks (MOFs) for the K MOFs with the largest Henry coefficients for a given adsorbate (the top-K). Computing each MOF’s Henry coefficient typically requires a costly Monte Carlo simulation, in which each sample inserts an adsorbate molecule into the MOF at a random position and orientation and calculates the MOF-adsorbate interaction energy.
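The insertion-based estimate described above can be sketched as follows. This is a minimal illustration, assuming the standard Widom-insertion relation for a rigid framework, K_H = β⟨exp(−βU)⟩ with β = 1/(RT), where U is the MOF-adsorbate interaction energy of one random insertion. The function name, the units, and the toy Gaussian energy samples are assumptions for illustration and are not taken from the abstract.

```python
import math
import random

R = 8.314462618  # ideal gas constant, J/(mol K)

def henry_coefficient_estimate(energies_kj_per_mol, temperature_k=298.0):
    """Estimate a Henry coefficient [mol/(m^3 Pa)] from insertion energies.

    Assumes the Widom relation K_H = beta * <exp(-beta * U)>, where each
    energy U [kJ/mol] comes from one random adsorbate insertion.
    """
    beta = 1.0 / (R * temperature_k)  # mol/J
    # Boltzmann factor of each insertion (energies converted from kJ to J).
    boltzmann = [math.exp(-beta * 1000.0 * u) for u in energies_kj_per_mol]
    return beta * sum(boltzmann) / len(boltzmann)

# Toy stand-in for a molecular simulation: hypothetical Gaussian energies.
rng = random.Random(0)
toy_energies = [rng.gauss(-15.0, 5.0) for _ in range(1000)]
k_h = henry_coefficient_estimate(toy_energies)
```

Because the estimate is a sample mean of Boltzmann factors, its statistical error shrinks as more insertions are allocated to a MOF, so the number of insertions sets the fidelity of that MOF's simulation.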


We frame this task as a top-K arm identification problem in the multi-armed bandit setting of classic reinforcement learning, sequentially and adaptively allocating adsorbate insertions among the MOFs in a data-driven manner to obtain the most accurate top-K subset under a fixed insertion budget. Treating each MOF as a slot machine (arm) in a casino, each adaptive allocation algorithm (1) proceeds in a feedback loop: (i) allocate adsorbate insertions to one or more MOFs, (ii) update the running estimates of the Henry coefficients of those MOFs, then (iii) judiciously allocate adsorbate insertions to the next MOF(s); (2) sequentially dials up the fidelity of the ongoing molecular simulations in the MOFs, yielding a multi-fidelity computational screening; and (3) circumvents the need to hand-craft structural or chemical features of the MOFs.
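The allocate, update, re-allocate loop can be sketched with a simplified sequential-halving-style routine for top-K identification. This is an illustrative sketch under stated assumptions, not the authors' implementation: each MOF is represented by a hypothetical zero-argument `sampler` that returns one noisy sample (for example, one Boltzmann factor from an insertion), the budget is split evenly across rounds, and running sums are kept across rounds so that surviving MOFs accumulate samples.

```python
import math

def sequential_halving_top_k(samplers, k, budget):
    """Return (indices of the estimated top-k arms, samples used per arm).

    samplers: list of zero-argument callables, one per arm (hypothetical).
    budget:   total number of samples allowed across all arms.
    """
    n = len(samplers)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of arms")
    n_rounds = max(1, math.ceil(math.log2(n / k)))
    if budget < n * n_rounds:
        raise ValueError("budget too small for one sample per arm per round")
    per_round = budget // n_rounds

    sums = [0.0] * n    # running sum of samples per arm
    counts = [0] * n    # running number of samples per arm
    survivors = list(range(n))
    for _round in range(n_rounds):
        # (i) allocate this round's share evenly among surviving arms
        pulls = per_round // len(survivors)
        for i in survivors:
            for _pull in range(pulls):
                sums[i] += samplers[i]()
                counts[i] += 1
        # (ii) rank survivors by their updated running means
        survivors.sort(key=lambda i: sums[i] / counts[i], reverse=True)
        # (iii) keep the better half, but never fewer than k arms
        survivors = survivors[:max(k, math.ceil(len(survivors) / 2))]
    return sorted(survivors[:k]), counts

# Usage with noiseless toy arms (hypothetical means).
toy_arms = [lambda m=m: m for m in (0.1, 0.9, 0.5, 0.7)]
top, used = sequential_halving_top_k(toy_arms, k=2, budget=40)
```

Arms eliminated early receive few samples (low fidelity), while arms that survive to later rounds receive many (high fidelity), which is the sense in which the screening is multi-fidelity.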


As a case study, we implement, benchmark, and analyze three algorithms (sequential halving, successive accepts and rejects, and narrowing exploration, our proposed heuristic) that adaptively allocate xenon insertions to screen a set of ca. 300 MOFs for the subset with the top-K Xe Henry coefficients over differing insertion budgets. Provided with a sufficient budget, we find that these adaptive insertion algorithms can significantly reduce (by a factor of two to three) both the simple regret (the sum of the true Henry coefficients of the true top-K MOFs minus that of the empirically selected top-K MOFs) and the top-K identification error, and provide a 35% discount on the computational cost to identify the top-K MOFs with less than 5% error. We thereby demonstrate that top-K arm identification algorithms may be generally useful for screening materials more efficiently for various properties via Monte Carlo molecular simulations, and for enabling the adoption of more sophisticated force fields or even ab initio calculations of the potential energy of configurations to yield higher-fidelity screenings.
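The two benchmark metrics can be sketched as follows. The simple regret follows the definition given in the abstract. The abstract does not define the top-K identification error, so the version here (the fraction of the true top-K MOFs missing from the selected subset) is an assumption for illustration, as are the function names and toy values.

```python
def simple_regret(true_values, selected, k):
    """Sum of the true top-k values minus the sum of the true values
    of the k selected items."""
    best = sorted(true_values, reverse=True)[:k]
    return sum(best) - sum(true_values[i] for i in selected)

def top_k_error(true_values, selected, k):
    """Assumed definition: fraction of the true top-k items that are
    missing from the selected subset."""
    ranked = sorted(range(len(true_values)),
                    key=lambda i: true_values[i], reverse=True)
    true_top = set(ranked[:k])
    return len(true_top - set(selected)) / k

# Toy example with hypothetical Henry coefficients (arbitrary units).
true_kh = [5.0, 1.0, 4.0, 3.0]
picked = [0, 3]  # the true top-2 would be indices 0 and 2
regret = simple_regret(true_kh, picked, k=2)
error = top_k_error(true_kh, picked, k=2)
```

Simple regret is forgiving when a selected MOF is nearly as good as the one it displaced, whereas the identification error counts every miss equally.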