2024 AIChE Annual Meeting

(709d) A Computational Approach for Reliable Materials Discovery: Application to Green Refrigerant Discovery

Authors

Maginn, E., University of Notre Dame
This talk describes a computational workflow for the reliable discovery of materials and its application to the discovery of green refrigerants. Refrigerant fluids in current use harm the environment because of their high global warming potential, creating a need for new, green refrigerants. Conventional molecular discovery involves screening existing databases of molecules, such as PubChem, and assessing their utility experimentally or via semi-predictive engineering correlations. As an alternative, generative machine learning (ML) coupled with ML-based property predictors can be used to obtain novel molecules with desired properties. Optimization-based frameworks built on mathematical programming are also popular for molecular discovery. All these approaches have strengths but also notable weaknesses, which limit their use for de novo materials discovery tasks.

The novelty of our work lies in an integrated approach that systematically draws on the strengths of the methods mentioned above while mitigating their weaknesses. Instead of screening existing public (but limited) databases or using generative ML models (which require enormous amounts of training data), we applied mathematical programming to exhaustively enumerate all feasible candidate molecules subject to well-informed design constraints. We used FineSMILES, a tool we developed for fast and exhaustive enumeration. FineSMILES collects packets of molecular fragments made up of pre-selected elements, screens the packets against design constraints, and assembles them into SMILES strings. The resulting collection of SMILES strings represents all or almost all feasible and desirable molecules under those constraints. Because highly sophisticated design constraints are easy to enforce in this approach, it offers more control and precision in generating high-potential molecules. Applying the FineSMILES code to the refrigerant design problem generated hundreds of thousands of molecules, more than fifty percent of which are not present in the PubChem database.

We then leveraged the strengths of ML for property prediction to screen the generated molecules. We developed and applied ML-based tools for state-of-the-art, independent property predictions with uncertainty quantification for several thermodynamic, environmental, and safety properties of interest. In the first approach, predictions from simple group contribution (GC) models, such as the Joback GC model, together with molecular weight served as inputs to Gaussian process (GP) models for accurate property prediction. In the second, sigma profiles served as the molecular descriptor and input feature vector to GP models for direct property prediction. In the third, sigma profiles were used to predict PC-SAFT model parameters for novel molecules.
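
As a rough illustration of the packet-screening idea behind the enumeration step, the sketch below enumerates multisets ("packets") of fragments and keeps only those that can form a connected acyclic molecule and that satisfy one simple design constraint. It is not the FineSMILES code: the fragment library, the acyclic valence rule, and the "at least one fluorinated fragment" constraint are all assumptions chosen for illustration, and the assembly of packets into SMILES strings is not shown.

```python
from itertools import combinations_with_replacement

# Hypothetical fragment library (assumption): name -> number of free bonds.
FRAGMENTS = {"CH3": 1, "CF3": 1, "CH2": 2, "CF2": 2, "CH": 3}

# Hypothetical design constraint (assumption): at least one of these
# fluorinated fragments must appear in every accepted packet.
FLUORINATED = ("CF3", "CF2")


def is_acyclic_feasible(packet):
    """True if the fragments can be joined into one connected acyclic molecule.

    A tree on n nodes has n - 1 bonds, so the free bonds must sum to 2 * (n - 1).
    """
    n = len(packet)
    return n >= 2 and sum(FRAGMENTS[f] for f in packet) == 2 * (n - 1)


def enumerate_packets(max_fragments):
    """List every packet of 2..max_fragments fragments that passes both screens."""
    feasible = []
    for n in range(2, max_fragments + 1):
        for packet in combinations_with_replacement(sorted(FRAGMENTS), n):
            if not is_acyclic_feasible(packet):
                continue  # cannot be assembled into an acyclic structure
            if not any(f in FLUORINATED for f in packet):
                continue  # fails the illustrative composition constraint
            feasible.append(packet)
    return feasible
```

Calling `enumerate_packets(4)` returns the surviving packets as tuples of fragment names. Screening at the packet level, before any structures are assembled, is what keeps an exhaustive enumeration of this kind tractable.
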
We used all three methods independently to predict properties and scored the reliability of the predictions for each screened molecule by the level of agreement among the three independently predicted values. We also considered the GP prediction uncertainties as additional information for determining prediction reliability. Based on thermophysical, environmental, and safety performance, several hundred molecules were identified as high-potential green refrigerants, many of them not previously reported in the open literature. Retrosynthetic accessibility (RA) scores were used for a preliminary assessment of the synthesizability of the high-performing molecules. A few tens of molecules were identified with favorable RA scores.
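
The abstract does not give the scoring formula, so the following is only a minimal sketch of how agreement among independent predictions could be turned into a reliability score. The function name, the relative-spread measure, the way the GP standard deviation is folded in, and the `tol` parameter are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def reliability_score(predictions, gp_std=None, tol=0.05):
    """Return a score in [0, 1]; higher means closer agreement.

    predictions: property values from the independent methods (same units).
    gp_std:      optional GP predictive standard deviation (assumed input).
    tol:         relative spread at which the score drops to 0.5 (assumption).
    """
    centre = mean(predictions)
    spread = pstdev(predictions)  # disagreement between the methods
    if gp_std is not None:
        # Combine method disagreement and GP uncertainty in quadrature.
        spread = (spread ** 2 + gp_std ** 2) ** 0.5
    relative = spread / abs(centre) if centre else float("inf")
    return 1.0 / (1.0 + relative / tol)


# Hypothetical boiling-point predictions (K) from three methods.
tight = reliability_score([300.0, 301.0, 299.0])
loose = reliability_score([300.0, 330.0, 270.0])
```

With these made-up inputs, `tight` is larger than `loose`, so a molecule whose three predictions nearly coincide ranks as more reliable than one whose predictions scatter widely.
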

We then ran high-throughput molecular dynamics (MD) simulations with off-the-shelf interatomic potentials (force fields) to further validate and assess the technical performance of the identified high-performing and ‘likely synthesizable’ molecules. MD simulations were used to ascertain the miscibility of the discovered molecules for designing binary refrigerant mixtures. Relevant thermophysical properties of hundreds of feasible and advantageous binary mixtures were predicted across multiple compositions using high-throughput MD simulations. Five of the most promising refrigerant mixtures, none of which have been previously reported and for which we recorded a high level of certainty in predicted performance, were proposed to experimental collaborators for testing.

This work details a novel integration and synergistic use of existing tools and methods to develop a robust computational molecular discovery workflow, applied here to the green refrigerant design problem. Hundreds of feasible single-component candidates, not previously reported, were identified. Five novel green refrigerant mixtures were designed from some of the discovered molecules and recommended for experimental testing. The approach can be adapted to several other types of problems, especially those involving small-molecule discovery and design.