2023 AIChE Annual Meeting

(199g) Integrating Experimental Data into Molecular Simulations: A Hybrid Modeling Approach for Bridging Theory and Experiment

Authors

Pahari, S. - Presenter, Texas A&M University
Kwon, J., Texas A&M University
Molecular simulations play a significant role in fields such as materials design, drug discovery, the discovery of reaction pathways in catalytic systems, and the development of quantitative structure-property relationships for chemical species [1]. However, despite significant advances in molecular simulation techniques such as molecular dynamics (MD) and kinetic Monte Carlo (kMC) simulations, these methods fail to explain some essential observations [2, 3]. This is mainly because certain molecular-scale mechanisms, involving interactions or kinetics, are not yet understood or cannot yet be modeled accurately. The empirical nature of force-field parameters in MD simulations and of rate constants in kMC simulations adds to the uncertainty of these models, making it difficult to use them to explain experimental observations [3, 4].

Researchers have attempted to address these challenges by developing force-field parameterization schemes, such as deriving conventional parameters for MD simulations from quantum mechanical calculations [5]. Similarly, activation energy barriers for kMC simulations are calculated directly from free energy diagrams obtained from quantum mechanical calculations [6]. However, these models still fail to explain many experimental results because of the approximations underlying the quantum mechanical models on which they are based. Machine learning techniques, such as artificial neural networks, have also been used to identify optimal model parameters using data from multiple sources [7, 8]. These approaches have their own limitation: parameters determined for a specific class of molecules cannot be extended to others.

To overcome these limitations, researchers have developed a hybrid modeling framework that combines black-box models, such as neural networks, with first-principles models, such as physics-based molecular models. Hybrid models provide higher accuracy than first-principles models, are more robust than black-box models, and have strong extrapolation capabilities. They also preserve the interpretability of first-principles models, allowing important inferences to be made from simulation results.

The hybrid modeling framework begins with a sensitivity analysis of the model parameters, such as force-field parameters for MD simulations and rate constants for kMC simulations. Since analytical expressions for sensitivity cannot be obtained for MD and kMC simulations, a data-driven sensitivity analysis based on polynomial chaos expansion is performed to identify the most sensitive parameters. The equations of the first-principles model are then modified using a deep neural network (DNN), which predicts the most sensitive parameters as a function of the model inputs and the model outputs from the previous time step. The neural network is trained using input and output data generated under varying simulation conditions, such as pH, temperature, pressure, and species concentration. The loss function used for training is the mean squared error between the outputs of the modified first-principles model and the experimental data.
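The data-driven sensitivity step described above can be illustrated with a minimal sketch. The abstract does not specify the polynomial basis, expansion order, sampling scheme, or sensitivity measure, so everything below is an assumption made for illustration: parameters are taken to be scaled to a uniform range of [-1, 1], the expansion uses a total-degree Legendre basis fitted by ordinary least squares, and sensitivity is reported as Sobol indices read off the fitted coefficients. The function `toy_simulation` is a hypothetical stand-in for an expensive MD or kMC run, and the names `multi_indices`, `design_matrix`, and `pce_sobol_indices` are likewise illustrative.

```python
import itertools

import numpy as np
from numpy.polynomial import legendre


def multi_indices(n_params, max_degree):
    """All polynomial multi-indices with total degree <= max_degree."""
    return [alpha
            for alpha in itertools.product(range(max_degree + 1), repeat=n_params)
            if sum(alpha) <= max_degree]


def design_matrix(x, indices):
    """Orthonormal Legendre product basis evaluated at samples x in [-1, 1]^d."""
    n_samples = x.shape[0]
    columns = []
    for alpha in indices:
        col = np.ones(n_samples)
        for j, deg in enumerate(alpha):
            coeffs = np.zeros(deg + 1)
            coeffs[deg] = 1.0  # selects the Legendre polynomial P_deg
            # sqrt(2*deg + 1) makes P_deg unit-variance under the uniform density
            col = col * np.sqrt(2 * deg + 1) * legendre.legval(x[:, j], coeffs)
        columns.append(col)
    return np.column_stack(columns)


def pce_sobol_indices(x, y, max_degree=3):
    """Fit a PCE surrogate by least squares; return (first-order, total) Sobol indices."""
    n_params = x.shape[1]
    indices = multi_indices(n_params, max_degree)
    coef, *_ = np.linalg.lstsq(design_matrix(x, indices), y, rcond=None)
    idx = np.array(indices)
    # With an orthonormal basis, the output variance is the sum of squared
    # coefficients over all non-constant terms.
    variance = np.sum(coef[idx.sum(axis=1) > 0] ** 2)
    first = np.empty(n_params)
    total = np.empty(n_params)
    for j in range(n_params):
        involves_j = idx[:, j] > 0
        only_j = involves_j & (np.delete(idx, j, axis=1).sum(axis=1) == 0)
        first[j] = np.sum(coef[only_j] ** 2) / variance
        total[j] = np.sum(coef[involves_j] ** 2) / variance
    return first, total


def toy_simulation(x):
    """Hypothetical stand-in for a simulation output (not an MD or kMC model)."""
    return 3.0 * x[:, 0] + x[:, 1] ** 2 + 0.5 * x[:, 0] * x[:, 2]


rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(300, 3))  # assumed scaled parameter samples
first_order, total_order = pce_sobol_indices(samples, toy_simulation(samples))
for j in range(samples.shape[1]):
    print(f"parameter {j}: first-order = {first_order[j]:.4f}, total = {total_order[j]:.4f}")
```

In a workflow of this kind, the parameters with the largest total indices would be the natural candidates for the DNN to predict, while the remaining parameters stay at their nominal values. A parameter whose total index is well above its first-order index acts mainly through interactions with other parameters.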

The hybrid modeling framework is applied to two separate case studies: (1) catalysis of the nitrogen reduction reaction (NRR) on ruthenium catalysts via kMC simulations, and (2) self-assembly of surfactant molecules into complex nanostructures called dynamic binary complexes using coarse-grained MD simulations. These case studies demonstrate the capability of the hybrid modeling framework to integrate experimental results into molecular simulations, providing more accurate and reliable results.

References

[1] Frenkel, D., & Smit, B. (2001). Understanding molecular simulation: from algorithms to applications (Vol. 1). Elsevier.

[2] Van Gunsteren, W. F., & Berendsen, H. J. (1990). Computer simulation of molecular dynamics: methodology, applications, and perspectives in chemistry. Angewandte Chemie International Edition in English, 29(9), 992-1023.

[3] Fröhlking, T., Bernetti, M., Calonaci, N., & Bussi, G. (2020). Toward empirical force fields that match experimental observables. The Journal of Chemical Physics, 152(23), 230902.

[4] Gkeka, P., Stoltz, G., Barati Farimani, A., Belkacemi, Z., Ceriotti, M., Chodera, J. D., & Lelièvre, T. (2020). Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems. Journal of Chemical Theory and Computation, 16(8), 4757-4775.

[5] Behler, J., & Parrinello, M. (2007). Generalized neural-network representation of high-dimensional potential-energy surfaces. Physical Review Letters, 98(14), 146401.

[6] Allen, A. E., Robertson, M. J., Payne, M. C., & Cole, D. J. (2019). Development and validation of the quantum mechanical bespoke protein force field. ACS Omega, 4(11), 14537-14550.

[7] Thaler, S., & Zavadlav, J. (2021). Learning neural network potentials from experimental data via Differentiable Trajectory Reweighting. Nature Communications, 12(1), 6884.

[8] Hermann, M. R., & Hub, J. S. (2019). SAXS-restrained ensemble simulations of intrinsically disordered proteins with commitment to the principle of maximum entropy. Journal of Chemical Theory and Computation, 15(9), 5103-5115.