In recent years, machine learning (ML) methods have transformed computational chemistry and materials research. ML algorithms rely on learned representations that serve as a “mathematical proxy” for the underlying chemistry. Molecular featurization—how we transform atoms and molecules into mathematical signals suitable for machine-learning thermodynamic quantities—plays an important role in our ability to learn material properties and observable quantities. There are many ways to encode raw chemical data, including the popular SMILES strings, symmetrized correlation functions, and implicit representations learned by deep model architectures. Unfortunately, while these representations have demonstrated unparalleled success in predictive modeling, their high dimensionality often makes it difficult to extract meaningful scientific hypotheses or conclusions from their performance.
In this talk, I will focus on how we assess and interpret models built on such molecular representations, and in particular on how to extract actionable chemical and physical principles from models trained on chemical data, a task traditionally approached through unsupervised analyses such as principal component analysis (PCA) or t-distributed stochastic neighbor embedding (t-SNE). However, these methods only ask, “What makes these data points similar?” rather than “In what ways does my model see these points as similar?” The latter question, particularly in the context of supervised ML models, is more powerful and informative for establishing structure–property relationships. Our results show that this multi-objective framing, with its inherent interpretability, reveals underlying trends across many ML tasks, from materials classification to the construction of machine-learning potentials to non-linear regression.
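The distinction between the two questions can be illustrated with a toy sketch. Below, on synthetic data with an assumed linear structure–property relationship (all feature names, dimensions, and the linear model are illustrative assumptions, not the methods of the talk), the leading PCA direction reflects only variance in the features, while the weights of a fitted supervised model recover the feature direction that is actually relevant to the property:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical molecular featurization: 200 molecules, 5 isotropic features.
X = rng.normal(size=(200, 5))
# Assumed ground truth: only the first feature controls the property y.
w_true = np.array([3.0, 0.0, 0.0, 0.0, 0.0])
y = X @ w_true + 0.1 * rng.normal(size=200)

# Unsupervised view ("what makes these points similar?"):
# PCA finds the direction of maximal feature variance, blind to y.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]

# Supervised view ("in what ways does my model see them as similar?"):
# a least-squares fit exposes the feature direction the model uses.
w_fit, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)
model_dir = w_fit / np.linalg.norm(w_fit)

# Alignment of each direction with the property-relevant axis;
# the supervised direction should align closely, PC1 need not.
true_dir = w_true / np.linalg.norm(w_true)
print("PCA alignment:       ", abs(pc1 @ true_dir))
print("supervised alignment:", abs(model_dir @ true_dir))
```

Because the features here are isotropic, PC1 is essentially arbitrary, while the fitted weights align almost perfectly with the property-relevant direction; this is the sense in which interrogating a supervised model can be more informative than unsupervised embeddings alone.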