2025 AIChE Annual Meeting

A Probabilistic Approach to Multiscale Simulation: Evaluating Machine-Learned Backmapping Models of Diphenylalanine

Diphenylalanine (FF) self-assembles into several nanostructures, a subset of which are electrically conductive and have promising applications in biotechnology and nanoelectronics. However, the mechanisms underlying this self-assembly remain poorly understood. Atomistic molecular dynamics (MD) simulations resolve the atomic-level interactions needed to understand the thermodynamic driving forces behind self-assembly, but they are too computationally expensive to reach the timescales on which self-assembly occurs. Coarse-grained (CG) models reach those timescales but sacrifice vital atomistic detail. It is therefore important to develop methods for switching efficiently between atomistic and CG resolution, so that the relevant properties can be probed at a manageable computational cost. Mapping from the atomistic to the CG representation is well established, but the reverse step, reconstructing atomistic detail from a CG structure (backmapping), remains a significant challenge. Deterministic backmapping methods have been thoroughly studied, yet they fail to capture the inherent variability in a CG structure: many atomistic configurations are consistent with the same CG configuration. In our work, we use a machine-learned probabilistic backmapping approach to better represent this variability. Our models reconstruct the positions of the 43 atoms of FF from the eight-bead representation used by the MARTINI CG force field. Separate models were trained in internal bond-angle-torsion (BAT) coordinates for each unique CG bead. Sampling and analysis of the learned probability distributions show that the models accurately reproduce the atomistic structure of diphenylalanine. Evaluation of the potential energies of the backmapped structures shows that the models predict physically viable, low-energy conformations in the majority of cases.
The compartmentalized training procedure also allows the approach to be extended to more complex systems, such as long-chain polymers and proteins, with relative ease.
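
To illustrate what sampling in BAT coordinates involves, the following is a minimal sketch, not the models described above. It assumes that an atom is placed from three already-positioned reference atoms using a sampled bond length, bond angle, and torsion (the standard internal-to-Cartesian construction), and that a learned distribution can be stood in for by simple Gaussian and von Mises draws. The function names and every numerical parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    norm = math.sqrt(dot(a, a))
    return (a[0] / norm, a[1] / norm, a[2] / norm)


def place_atom(a, b, c, bond, angle, torsion):
    """Return position d with |c-d| = bond, angle(b, c, d) = angle and
    torsion(a, b, c, d) = torsion (radians). Assumes a, b, c are not collinear."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))   # normal to the a-b-c plane
    m = cross(n, bc)                 # completes the local right-handed frame
    x = -bond * math.cos(angle)
    y = bond * math.sin(angle) * math.cos(torsion)
    z = bond * math.sin(angle) * math.sin(torsion)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))


def measure_torsion(a, b, c, d):
    """Signed torsion a-b-c-d in radians, in (-pi, pi]."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    return math.atan2(dot(cross(n1, n2), unit(b2)), dot(n1, n2))


def sample_placements(a, b, c, n_samples, rng):
    """Draw several candidate positions for one atom from a stand-in
    distribution over (bond, angle, torsion). All parameters are hypothetical."""
    placements = []
    for _ in range(n_samples):
        bond = rng.gauss(1.53, 0.02)                           # angstrom
        angle = rng.gauss(math.radians(111.0), math.radians(3.0))
        torsion = rng.vonmisesvariate(math.radians(60.0), 8.0)  # periodic variable
        placements.append(place_atom(a, b, c, bond, angle, torsion))
    return placements


if __name__ == "__main__":
    a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    for d in sample_placements(a, b, c, 3, random.Random(0)):
        print(d, math.degrees(measure_torsion(a, b, c, d)))
```

Each draw yields a different but geometrically consistent atom position for the same reference frame, which is the kind of variability a single deterministic reconstruction cannot represent.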