2023 AIChE Annual Meeting
(197au) Multi-Fidelity Deep Learning for Data-Efficient Molecular Property Models from Experimental and Computational Data
Authors
To address these questions, we present a comprehensive benchmark of multi-fidelity methods. We systematically add noise and bias to a synthetic dataset and split high- and low-fidelity data in ways that mimic realistic use cases. We also evaluate the multi-fidelity methods on several real-world datasets of optical properties (experiments and time-dependent density functional theory calculations), solubility (experiments and COSMO-RS calculations), and drug efficacy/potency (single-dose and dose-response measurements). We compare the multi-fidelity model performance to transfer learning and Δ-machine learning and provide recommendations for best practices in training models when multiple levels of fidelity are available. Finally, we demonstrate the application of uncertainty quantification and active learning in these models. The more thorough understanding of multi-fidelity methods we develop in this work will allow for more data-efficient molecular and materials design.
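As an illustration of the synthetic setup and the Δ-machine learning baseline mentioned above, the following is a minimal sketch and not the benchmark used in this work. The target function, the linear bias form, the noise level, the dataset sizes, and the helper names (`make_low_fidelity`, `fit_linear`, `predict_linear`) are all assumptions chosen for illustration, and a plain least-squares model stands in for a deep learning model.

```python
import numpy as np


def make_low_fidelity(y_high, x, noise_std=0.1, bias_slope=0.5, bias_offset=0.2, rng=None):
    """Hypothetical corruption: linear bias in the first feature plus Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    bias = bias_offset + bias_slope * x[:, 0]
    noise = rng.normal(0.0, noise_std, size=y_high.shape)
    return y_high + bias + noise


def fit_linear(features, targets):
    """Ordinary least squares with an intercept column."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef


def predict_linear(coef, features):
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


rng = np.random.default_rng(0)
n_total, n_high = 500, 25  # assumed sizes: many cheap labels, few expensive ones

# Synthetic "molecular descriptors" and an assumed high-fidelity property.
x = rng.uniform(-1.0, 1.0, size=(n_total, 3))
y_high = np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2 - 0.5 * x[:, 2]
y_low = make_low_fidelity(y_high, x, rng=rng)

# Only a small random subset carries high-fidelity labels; the rest is held out.
high_idx = rng.choice(n_total, size=n_high, replace=False)
test_mask = np.ones(n_total, dtype=bool)
test_mask[high_idx] = False

# Delta learning: model the correction (high minus low) on the labeled subset,
# then add the predicted correction to the low-fidelity value at test time.
delta_coef = fit_linear(x[high_idx], y_high[high_idx] - y_low[high_idx])
y_delta = y_low[test_mask] + predict_linear(delta_coef, x[test_mask])

# Baselines: the same model trained on high-fidelity data only, and the raw
# low-fidelity values used directly as predictions.
hf_coef = fit_linear(x[high_idx], y_high[high_idx])
y_hf_only = predict_linear(hf_coef, x[test_mask])

rmse_delta = rmse(y_delta, y_high[test_mask])
rmse_hf_only = rmse(y_hf_only, y_high[test_mask])
rmse_low = rmse(y_low[test_mask], y_high[test_mask])

print(f"delta learning RMSE:     {rmse_delta:.3f}")
print(f"high-fidelity only RMSE: {rmse_hf_only:.3f}")
print(f"raw low-fidelity RMSE:   {rmse_low:.3f}")
```

In this toy setting the correction is easier to learn than the property itself, so the Δ-learning prediction should improve on both baselines. Note that this form of Δ-learning needs a low-fidelity value for every molecule at prediction time, which is one of the practical differences from transfer learning and other multi-fidelity approaches.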
References:
[1]: Vermeire, Florence H., and William H. Green. "Transfer learning for solvation free energies: From quantum chemistry to experiments." Chemical Engineering Journal 418 (2021): 129307.
[2]: Ramakrishnan, Raghunathan, et al. "Big data meets quantum chemistry approximations: the Δ-machine learning approach." Journal of Chemical Theory and Computation 11.5 (2015): 2087-2096.
[3]: Buterez, David, et al. "Multi-fidelity machine learning models for improved high-throughput screening predictions." ChemRxiv (2022).