2025 AIChE Annual Meeting

(207b) Decoding AlphaFold Pair Representations for Pharmaceutical Applications

Authors

Vishruth Devan, Columbia University
Venkat Suprabath Bitra, Columbia University
Venkat Venkatasubramanian, Columbia University
Physical experiments that map molecular interactions remain resource-intensive and contribute
to bottlenecks in drug discovery. AlphaFold2 revolutionized protein structure prediction,
but its intermediate pair representations, the 128-dimensional embeddings that capture
residue pair relationships derived from evolutionary sequence alignments, have not yet been
explored to their full potential. We propose a framework to leverage these representations in
three ways. First, we decomposed the attention maps to infer biophysical interaction types,
such as hydrogen bonds or π-π stacking. Then, we performed a Zipfian analysis, inspired by
the common pattern in natural language that the frequency of a word is inversely proportional
to its rank in a frequency table. This analysis allowed us to uncover “chemical grammar
rules” that were conserved in the AlphaFold2 pipeline. Finally, we propose to combine these
learned interaction features with chemical domain knowledge to build a hybrid classifier,
capable of predicting interaction profiles between target proteins from their attention
patterns. This classifier leverages the established links between evolutionary
sequence patterns and physicochemical interaction landscapes, and has several potential
applications, including target-specific therapeutic candidate screening and drug repurposing.
This framework highlights how we can combine insights from data with domain knowledge
for interpretable, efficient drug discovery.
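
As a rough illustration of the Zipfian analysis mentioned above, the sketch below estimates a Zipf exponent by fitting a straight line to log-frequency against log-rank. The abstract does not say how pair representations are converted into countable tokens, so the token labels, the toy counts, and the function name zipf_exponent are all hypothetical and are not the authors' method.

```python
import math
from collections import Counter


def zipf_exponent(tokens):
    """Estimate s in frequency ~ rank**(-s) by least squares in log-log space."""
    # Frequencies sorted from most to least common define the ranks 1, 2, 3, ...
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct tokens to fit a slope")

    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # The fitted slope is negative for a decaying distribution; report s = -slope.
    return -sxy / sxx


# Hypothetical toy data: labels standing in for discretized interaction patterns,
# with counts constructed to be proportional to 1 / rank.
labels = ["hbond", "hydrophobic", "salt_bridge", "pi_pi", "cation_pi"]
tokens = []
for rank, label in enumerate(labels, start=1):
    tokens.extend([label] * (600 // rank))

exponent = zipf_exponent(tokens)
print(f"estimated Zipf exponent: {exponent:.3f}")
```

Because the toy counts are built to fall off as 1/rank, the estimated exponent should come out close to 1. On real data, a fit over all ranks is only a first check. The tail of the distribution is usually noisy and may need a more careful estimator.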