2025 AIChE Annual Meeting
(106e) Transformers for Early-Stage Drug Discovery: A Unified Screening Pipeline
Once trained, the transformer generates general-purpose molecular embeddings that are reused across multiple downstream tasks. These embeddings are fed into separate, lightweight feed-forward neural networks that predict drug-like properties and ADME-T (absorption, distribution, metabolism, excretion, and toxicity) characteristics. This modular architecture supports both regression tasks, such as predicting lipophilicity, aqueous solubility, volume of distribution, and acute toxicity, and classification tasks, such as predicting blood-brain barrier permeability, CYP450 enzyme inhibition, and mutagenicity. Because every task-specific network reads the same embeddings, molecular representations stay consistent across tasks while each network retains high task-specific accuracy.
We validate the pipeline with a case study on identifying potential inhibitors of HIV-1 integrase, a critical enzyme in the HIV replication cycle. Beginning with a library of over 1.04 million compounds, we filter candidates through a three-stage process. The first stage filters on drug-likeness and predicted bioactivity. The second stage evaluates ADME-T properties to remove compounds with suboptimal pharmacokinetic or toxicity profiles. The final stage predicts IC50 values and calculates binding efficiency indices to rank candidates by potency and molecular efficiency. This process narrows the initial library to 143 promising molecules, each predicted to show strong therapeutic potential and favorable pharmacological characteristics.
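The three-stage filtering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` fields, the boolean pass/fail flags, and the toy molecules are hypothetical placeholders for the model predictions. The abstract does not state which binding efficiency formula is used, so the sketch assumes one common definition, BEI = pIC50 divided by molecular weight in kDa.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record of per-molecule predictions (all fields are assumptions)."""
    name: str
    mol_weight: float  # molecular weight in g/mol
    drug_like: bool    # stage 1: predicted drug-likeness
    active: bool       # stage 1: predicted bioactivity against the target
    admet_ok: bool     # stage 2: summary flag for acceptable ADME-T predictions
    ic50_nm: float     # stage 3: predicted IC50 in nanomolar


def pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L)."""
    return 9.0 - math.log10(ic50_nm)


def bei(ic50_nm: float, mol_weight: float) -> float:
    """Binding efficiency index, assumed here to be pIC50 / (MW in kDa)."""
    return pic50(ic50_nm) / (mol_weight / 1000.0)


def screen(library: list[Candidate]) -> list[Candidate]:
    # Stage 1: keep drug-like molecules that are predicted to be bioactive.
    stage1 = [c for c in library if c.drug_like and c.active]
    # Stage 2: drop molecules with unfavorable ADME-T predictions.
    stage2 = [c for c in stage1 if c.admet_ok]
    # Stage 3: rank the survivors by binding efficiency, best first.
    return sorted(stage2, key=lambda c: bei(c.ic50_nm, c.mol_weight), reverse=True)


# Toy library with invented values, for illustration only.
library = [
    Candidate("mol_a", 400.0, True, True, True, 10.0),
    Candidate("mol_b", 500.0, True, True, True, 10.0),
    Candidate("mol_c", 350.0, True, False, True, 5.0),   # fails stage 1
    Candidate("mol_d", 300.0, True, True, False, 1.0),   # fails stage 2
]
ranked = screen(library)
print([c.name for c in ranked])
```

In this toy run only `mol_a` and `mol_b` survive both filters, and `mol_a` ranks first because it reaches the same predicted potency at a lower molecular weight.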
Our model achieves high performance across all tasks, with regression models reaching R² values above 0.96 and classification tasks yielding precision, recall, and F1 scores consistently above 0.97. Importantly, the modular design of the pipeline allows for easy adaptation to new properties, therapeutic targets, or regulatory criteria by retraining only the downstream prediction layers on the same transformer-generated embeddings. This adaptability is particularly valuable in real-world drug discovery, where project goals and constraints often vary widely.
In summary, this work introduces a scalable, accurate, and adaptable transformer-based screening pipeline that consolidates early-stage drug discovery into a single, cohesive framework. By replacing fragmented, descriptor-driven processes with a unified model that learns directly from molecular sequences, our approach accelerates the identification of viable drug candidates, reduces development costs, and significantly mitigates the risk of late-stage failures. The framework sets a new benchmark for how deep learning and attention-based models can transform the front end of drug development pipelines.
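The adaptation route described above, where the embeddings stay fixed and only a downstream prediction layer is retrained, can be sketched in plain Python. Everything here is an assumption made for illustration: `embed` is a hypothetical stand-in for the frozen transformer encoder (it only counts atom symbols in a SMILES string), the heads are single linear or logistic layers rather than the feed-forward networks used in the work, and the targets are invented toy values.

```python
import math


def embed(smiles):
    """Hypothetical frozen encoder standing in for the trained transformer."""
    n = max(len(smiles), 1)
    # Crude atom-fraction features plus a constant bias term.
    return [smiles.count("C") / n, smiles.count("O") / n, smiles.count("N") / n, 1.0]


def predict(weights, x, classify):
    """Linear output for regression, sigmoid probability for classification."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z)) if classify else z


def mean_loss(weights, embeddings, targets, classify):
    """Mean log loss (classification) or mean half squared error (regression)."""
    total = 0.0
    for x, y in zip(embeddings, targets):
        p = predict(weights, x, classify)
        if classify:
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        else:
            total += 0.5 * (p - y) ** 2
    return total / len(embeddings)


def train_head(embeddings, targets, classify, lr=0.1, epochs=500):
    """Full-batch gradient descent on one head; the embeddings are never modified."""
    weights = [0.0] * len(embeddings[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(embeddings, targets):
            # Both losses share the gradient form (prediction - target) * x.
            err = predict(weights, x, classify) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(embeddings)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


# Embeddings are computed once and shared by every head.
smiles = ["CCO", "CCN", "CCCC", "OCCO", "NCCN", "CCCCO"]
embeddings = [embed(s) for s in smiles]

# Invented toy targets, not measured data.
toy_property = [-0.3, -0.1, 2.9, -1.4, -2.0, 0.9]  # regression task
toy_label = [0, 0, 1, 0, 0, 1]                     # classification task

regression_head = train_head(embeddings, toy_property, classify=False)
classification_head = train_head(embeddings, toy_label, classify=True)
```

Adding a new property under this design means calling `train_head` again with new targets; the cached `embeddings` list and the encoder behind it are left untouched.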
References:
[1] J. Drews, Drug Discovery: A Historical Perspective, Science 287 (2000) 1960–1964. https://doi.org/10.1126/science.287.5460.1960.
[2] G. Sliwoski, S. Kothiwale, J. Meiler, E.W. Lowe, Computational Methods in Drug Discovery, Pharmacol Rev 66 (2014) 334–395. https://doi.org/10.1124/pr.112.007336.
[3] A. Khambhawala, C.H. Lee, S. Pahari, P. Nancarrow, N.A. Jabbar, M.M. El-Halwagi, J.S.-I. Kwon, Advanced transformer models for structure-property relationship predictions of ionic liquid melting points, Chemical Engineering Journal 503 (2025) 158578. https://doi.org/10.1016/j.cej.2024.158578.
[4] A. Khambhawala, C.H. Lee, S. Pahari, J.S.-I. Kwon, Minimizing late-stage failure in drug development with transformer models: Enhancing drug screening and pharmacokinetic predictions, Chemical Engineering Journal (2025) 160423. https://doi.org/10.1016/j.cej.2025.160423.