2025 AIChE Annual Meeting

(387cm) Advancing Various Aspects of Drug Discovery Using Molecular Simulations and Machine Learning: From Classification to Molecular Probing of Pathologies

Authors

Gül Zerze, Princeton University
Research Interests: Using modeling, high-performance computing (HPC), simulations, data analysis, deep learning, visualization, and structure prediction to advance drug discovery. The outcomes of my projects include:
  • Enhancement of molecular simulations using deep learning
  • Development of a new force-field model that can be used to study mesoscale phenomena
  • Classification of human transcription factors using unsupervised learning on proposed feature sets
  • Development of new order parameters to study biomolecular condensation using biased molecular dynamics simulations
  • Studying RNA-protein interactions using molecular dynamics simulations

A summary of my projects is as follows:

1. Exploring the Free-Energy Landscape and Transition Paths for RNA Folding/Unfolding using a Deep Learning-Driven Simulation Technique

Understanding how RNA stem-loops fold and unfold is crucial in biology. Simulating this process is computationally challenging because the folding landscape is complex and sampling it requires extensive computation. To address this challenge, we adapted DeepDriveMD (DDMD), a deep learning-driven simulation technique, to study RNA stem-loop folding dynamics. DDMD uses an autoencoder to adaptively learn a low-dimensional latent representation from an ensemble of running MD simulations. It then guides the simulations toward undersampled regions, directing computational resources to the relevant parts of phase space. The method yields a reasonable prediction of the free-energy landscape at room temperature and identifies the relevant slow degrees of freedom in RNA folding. DDMD simulates RNA folding more efficiently than conventional methods, providing insight into the phase space and kinetics of the system without extensive computational cost.
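The two ideas in this summary, estimating a free-energy profile from sampled configurations and restarting simulations from undersampled regions, can be illustrated with a minimal sketch. It is not the DDMD implementation: a precomputed one-dimensional coordinate `z` stands in for the autoencoder's latent space, a simple histogram stands in for the selection logic, and the function names, bin count, and temperature are all assumptions made for illustration.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def free_energy_profile(z, temperature=300.0, bins=20):
    """Estimate F(z) = -kT ln P(z) from samples of a 1D coordinate.

    Returns bin centers and free energies shifted so the minimum is zero.
    Bins that were never visited are assigned +inf.
    """
    counts, edges = np.histogram(z, bins=bins)
    prob = counts / counts.sum()
    free_energy = np.full(bins, np.inf)
    visited = counts > 0
    free_energy[visited] = -KB_KCAL * temperature * np.log(prob[visited])
    free_energy -= free_energy[visited].min()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, free_energy

def select_undersampled_frames(z, n_select=5, bins=20):
    """Return indices of the frames that sit in the least-populated bins.

    Hypothetical stand-in for choosing restart configurations in
    undersampled regions of a learned latent space.
    """
    counts, edges = np.histogram(z, bins=bins)
    # Interior edges map each sample to a bin index in [0, bins - 1].
    bin_index = np.digitize(z, edges[1:-1])
    population = counts[bin_index]
    return np.argsort(population, kind="stable")[:n_select]

# Toy data (not simulation output): a well-sampled basin plus one rare frame.
z = np.concatenate([np.linspace(0.0, 0.99, 1000), [5.0]])
centers, free_energy = free_energy_profile(z)
restart_frames = select_undersampled_frames(z, n_select=1)
```

In an adaptive workflow, the selected frames would seed the next round of simulations, and the profile would be re-estimated as sampling accumulates.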

2. Cholesterol Cluster Formation in Organic Solvents

Cholesterol is an essential component of cell membranes, but high levels of cholesterol in the blood can lead to the formation of cholesterol crystals, a process found to be preceded by the formation of cholesterol clusters. In this study, we investigated the clustering of cholesterol molecules in an ethanol-water mixture, used as a biomimetic solvent. We performed coarse-grained molecular dynamics simulations of this system using a mesoscale solvent model developed in our lab. The simulations were run at different saturation levels of cholesterol in the solvent mixture to determine the clustering pathways. We found that dimer formation precedes the formation of cholesterol clusters in the ethanol-water mixture, and we identified the molecular contacts that play the most crucial role in dimer formation.
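A common way to follow clustering in such trajectories is to group molecules whose reference sites lie within a distance cutoff and track the resulting cluster sizes over time. The sketch below illustrates that generic analysis for a single frame. It is not the analysis used in this work: the function name, the cutoff value, and the toy coordinates are assumptions, and periodic boundary conditions are ignored for brevity.

```python
import numpy as np

def cluster_sizes(positions, cutoff):
    """Group molecules into clusters by a distance cutoff (single linkage).

    positions: (N, 3) array with one reference site per molecule.
    cutoff: two molecules are bonded if their sites are closer than this.
    Returns cluster sizes sorted from largest to smallest.
    """
    n = len(positions)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Upper triangle only, so each pair is visited once.
    pairs = np.where(np.triu(dist < cutoff, k=1))
    for i, j in zip(*pairs):
        parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    _, sizes = np.unique(roots, return_counts=True)
    return sorted(sizes.tolist(), reverse=True)

# Toy frame (made-up coordinates, arbitrary units): a trimer, a dimer, a monomer.
frame = np.array([
    [0.0, 0.0, 0.0], [0.5, 0.0, 0.0],
    [5.0, 0.0, 0.0], [5.4, 0.0, 0.0], [5.8, 0.0, 0.0],
    [10.0, 0.0, 0.0],
])
sizes = cluster_sizes(frame, cutoff=0.6)
```

Applying the function frame by frame gives the time evolution of monomers, dimers, and larger clusters, which is one way to see whether dimers appear before larger aggregates.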

3. Classification of Human Transcription Factors via Unsupervised Learning

Transcription factors (TFs) consist primarily of DNA-binding domains (DBDs) and effector domains; the latter are the portions of the TF that govern the interactions responsible for promoting or inhibiting transcription. Most studies focus on DBD-DNA binding, which has advanced our understanding of the structural and functional properties of DBDs. Effector domains, although central to the process of transcription, remain poorly understood, especially those of human transcription factors (HTFs). As a result of the abundant information on DBDs, HTFs have been classified into multiple families based on their DBDs, whereas no such classification scheme exists for effector domains. We developed descriptors based on the sequence-dependent properties of the full TF sequences and their effector domains, and constructed a hyperparameter space in which to classify the HTFs by their effector domains. We then employed unsupervised machine learning to group the HTFs into effector domain-based clusters.
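The general workflow described here, turning sequences into numerical descriptors and clustering them without labels, can be sketched as follows. The abstract does not specify the descriptors or the clustering algorithm, so everything below is an assumption chosen for illustration: four composition-based descriptors, a plain k-means with a deterministic farthest-point initialization, and made-up toy sequences that are not real effector domains.

```python
import numpy as np

def sequence_descriptors(seq):
    """Hypothetical descriptors: fractions of basic, acidic, aromatic, and proline residues."""
    seq = seq.upper()
    n = len(seq)

    def fraction(letters):
        return sum(seq.count(a) for a in letters) / n

    return np.array([fraction("KR"), fraction("DE"), fraction("FWY"), fraction("P")])

def kmeans(features, k, n_iter=50):
    """Plain k-means with farthest-point initialization (deterministic)."""
    centers = [features[0]]
    for _ in range(k - 1):
        # Distance of every point to its nearest already-chosen center.
        nearest = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centers], axis=0
        )
        centers.append(features[np.argmax(nearest)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy sequences (invented for illustration): two acidic-rich, two basic-rich.
toy_sequences = ["DDEEDDEEDE", "EEDDEDEDDE", "KKRRKRKKRR", "RRKKRKRKKR"]
features = np.array([sequence_descriptors(s) for s in toy_sequences])
labels = kmeans(features, k=2)
```

With real data, the choice of descriptors, their scaling, and the number of clusters would each need to be validated, for example by comparing cluster stability across settings.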

4. Effect of RNA Binding on Structured Domains of Intrinsically Disordered Proteins (IDPs)

Protein-RNA interactions play a profound role in the formation of liquid-like biomolecular condensates. Many computational studies have examined such interactions, particularly between FUS (an IDP) and its target RNA, but none has produced converged results because the large system sizes demand very long simulation times. In this project, we used enhanced sampling techniques to obtain atomistic insight into the roles of RNA’s structure and the RGG domain of FUS in its conformational changes. We identified the artefacts associated with the use of different force fields, as well as the role of RNA’s structure in the stability of the folded part of FUS.