2025 AIChE Annual Meeting

(642g) Developing a Probabilistic Backmapping Framework to Enable Multiscale Simulation of Aqueous Solutions

Multiscale modeling is essential for simulating complex and soft materials: it allows efficient exploration of large-scale phenomena while retaining chemical detail, helping us understand and predict the behavior of aqueous systems. Coarse-graining, however, discards atomistic detail. To recover this detail without sacrificing computational efficiency, we focus on backmapping, the reconstruction of atomistic configurations from coarse-grained (CG) representations. Water is an ideal starting point. As one of the simplest yet most essential molecular systems, it combines well-characterized properties with anomalous behavior, posing a manageable yet meaningful challenge for backmapping. Traditional backmapping approaches are typically deterministic and often fail to capture the inherent configurational diversity and orientational complexity of water molecules. Recent developments in machine learning, however, make it possible to learn complex, high-dimensional probability densities. In this work, we develop a probabilistic backmapping framework for rigid, fixed-point-charge water models. The framework uses a variational autoencoder (VAE) architecture and is trained to predict local orientational distributions from CG inputs. It operates in a novel local orientational coordinate system and uses normalizing flows with geometric algebra attention to learn the conditional probability distributions of molecular orientations and positions. This setup allows us to generate atomistic water structures that are consistent with the underlying CG model. We validate our method on several CG water models, checking that backmapped configurations accurately reproduce important structural properties, such as radial distribution functions and hydrogen bond statistics. Beyond structural recovery, the probabilistic nature of our model enables additional sampling strategies, such as reweighting and Monte Carlo moves.
Overall, this data-driven approach provides a computationally efficient and physically consistent way to restore atomistic detail, helping improve the accuracy of thermodynamic property predictions in multiscale aqueous simulations.
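
As an illustration of what backmapping a rigid water model involves, the following minimal sketch places one rigid three-site water molecule on each CG site. It is not the framework described above: orientations are drawn uniformly at random as a stand-in for the learned conditional distribution. The SPC/E-like geometry (1.0 Å O-H bond, 109.47° H-O-H angle), the choice of mapping each CG site to the molecular center of mass, and all function names are assumptions made for this example.

```python
import numpy as np

# Assumed SPC/E-like rigid geometry; the abstract does not name a water model.
R_OH = 1.0                                  # O-H bond length (angstrom)
THETA = np.deg2rad(109.47)                  # H-O-H angle (radians)
MASSES = np.array([15.999, 1.008, 1.008])   # O, H, H (amu)


def reference_water():
    """Rigid template (O, H1, H2) with its center of mass at the origin."""
    half = THETA / 2.0
    xyz = np.array([
        [0.0, 0.0, 0.0],
        [R_OH * np.sin(half), 0.0, R_OH * np.cos(half)],
        [-R_OH * np.sin(half), 0.0, R_OH * np.cos(half)],
    ])
    com = MASSES @ xyz / MASSES.sum()
    return xyz - com


def random_rotations(n, rng):
    """Draw n uniformly distributed rotation matrices via unit quaternions."""
    q = rng.normal(size=(n, 4))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    w, x, y, z = q.T
    R = np.empty((n, 3, 3))
    R[:, 0, 0] = 1 - 2 * (y**2 + z**2)
    R[:, 0, 1] = 2 * (x * y - z * w)
    R[:, 0, 2] = 2 * (x * z + y * w)
    R[:, 1, 0] = 2 * (x * y + z * w)
    R[:, 1, 1] = 1 - 2 * (x**2 + z**2)
    R[:, 1, 2] = 2 * (y * z - x * w)
    R[:, 2, 0] = 2 * (x * z - y * w)
    R[:, 2, 1] = 2 * (y * z + x * w)
    R[:, 2, 2] = 1 - 2 * (x**2 + y**2)
    return R


def backmap(cg_positions, rng):
    """Place one rigid water per CG site.

    Assumes each CG site is the molecular center of mass. Orientations are
    uniform random (placeholder for a learned distribution).
    Returns an array of shape (n_molecules, 3 atoms, 3 coordinates).
    """
    cg_positions = np.asarray(cg_positions, dtype=float)
    R = random_rotations(len(cg_positions), rng)
    template = reference_water()
    # atoms[n, a, :] = R[n] @ template[a]
    atoms = np.einsum("nij,aj->nai", R, template)
    return atoms + cg_positions[:, None, :]


rng = np.random.default_rng(0)
cg_sites = rng.uniform(0.0, 20.0, size=(5, 3))  # hypothetical CG coordinates
atoms = backmap(cg_sites, rng)
```

Because the rotation is rigid, every generated molecule keeps the template bond length and angle, and its center of mass coincides with the CG site. A probabilistic backmapper in the spirit of the abstract would replace `random_rotations` with samples conditioned on the local CG environment.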