2025 AIChE Annual Meeting
(573b) From Atoms to Materials: Rapid Discovery of High-Performing MOFs for Gas Adsorption Using Minimal Input Information
The goal of this work was to develop a combined genetic algorithm and random forest model to discover novel high-performing metal-organic frameworks (MOFs) for gas adsorption. Because of the sheer number of MOF building blocks available, a nearly limitless number of unique MOFs have yet to be realized. Given the promising performance of known MOFs, it is highly probable that several of these unrealized MOFs are high-performing methane adsorbers. To find them, we developed the GARF model, which combines a Genetic Algorithm (GA) with Random Forest (RF) machine learning. By strategically selecting the features used to predict methane adsorption, we can use an evolutionary algorithm to discover high-performing MOFs for methane adsorption accurately and efficiently.
Using minimal input information, we developed an integrated genetic algorithm random forest machine learning (GARF) model to design and screen high-performing MOFs for gas adsorption. By combining structural, chemical, and crystal descriptors, we were able to predict methane adsorption rapidly and accurately. We trained the RF models on 80% of 50,000 hypothetical MOFs (hMOFs) from the MOFXDB database and tested them on the remaining 20%, achieving an R² value of 0.92 and a mean absolute percentage error (MAPE) of 10.2%. To screen hundreds of thousands of MOFs intelligently, we implemented a genetic algorithm (GA), which uses the principles of recombination and mutation to evolve solutions to problems. The input information to the GA is encoded in a data structure known as a chromosome. Each chromosome represents a hypothetical MOF and contains only four building blocks and two pieces of crystal information. From the chromosome, a chemical formula can be generated, and from that formula many chemical properties can be calculated. We also replaced the molecular simulations needed to calculate structural properties with RF machine learning, achieving high R² and relatively low MAPE values for each of the six structural properties predicted. With RF machine learning as the fitness evaluator for the GA, the GARF model is able to screen 250,000 hypothetical MOFs in mere minutes on a personal computer. Even with the 50 highest-performing MOFs in the database excluded from the training set, the GARF model evolved a MOF whose performance was equivalent to that of the eighteenth-best MOF among the 50,000 hMOFs. This finding validates the use of the GARF model and will allow us to extend it to predict the adsorption of other gases and other intrinsic properties of MOFs or other reticular materials rapidly and effectively.
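As an illustration of the chromosome encoding and evolutionary loop described above, the following is a minimal sketch and not the GARF implementation itself. The building-block names, the choice of which six genes make up a chromosome, the GA settings (population size, mutation rate, elitism), and the `surrogate_fitness` function are all hypothetical; in GARF the fitness would come from the trained RF models rather than the toy score used here.

```python
import random

# Hypothetical gene pools; the abstract does not list the actual building blocks.
METAL_CLUSTERS = ["Zn4O", "Cu2", "Zr6"]
LINKERS = ["BDC", "BTC", "BPDC", "NDC"]
FUNCTIONAL_GROUPS = ["H", "NH2", "F", "CH3"]
TOPOLOGIES = ["pcu", "fcu", "dia"]   # assumed crystal information
CATENATION = [0, 1, 2]               # assumed crystal information

# A chromosome holds four building blocks followed by two pieces of crystal information.
GENE_POOLS = [METAL_CLUSTERS, LINKERS, LINKERS, FUNCTIONAL_GROUPS,
              TOPOLOGIES, CATENATION]


def random_chromosome(rng):
    """Draw one gene from each pool to form a hypothetical MOF."""
    return tuple(rng.choice(pool) for pool in GENE_POOLS)


def surrogate_fitness(chrom):
    """Toy stand-in for the RF-predicted methane uptake (assumption)."""
    return float(sum(pool.index(g) for pool, g in zip(GENE_POOLS, chrom)))


def crossover(a, b, rng):
    """Single-point recombination of two parent chromosomes."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chrom, rng, rate=0.1):
    """Resample each gene from its own pool with probability `rate`."""
    return tuple(rng.choice(pool) if rng.random() < rate else gene
                 for pool, gene in zip(GENE_POOLS, chrom))


def evolve(fitness, pop_size=40, generations=30, elite=4, seed=0):
    """Evolve a population and return the fittest chromosome found."""
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = list(ranked[:elite])     # carry the best forward unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve(surrogate_fitness)
    print(best, surrogate_fitness(best))
```

Because every candidate is scored by a fast surrogate instead of a molecular simulation, the cost of each generation is dominated by cheap function calls, which is the property that makes screening very large candidate sets feasible on a personal computer.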
We then used the information obtained about the building block chemistries of high-performing MOFs to discover novel materials. Building on the results of previous iterations of the model and on the literature, we identified promising candidate building blocks and added a new metal cluster, organic linker, and functional group to the list of building blocks used as input to the model. We then ran GARF to screen for novel high-performing MOFs. The adsorption of the high-performing materials was further validated using grand canonical Monte Carlo (GCMC) simulations. The GARF model is a valuable tool for accelerating materials discovery.