2025 AIChE Annual Meeting

(183g) Transfer Learning from Large Language Models for Biology Predicts an Antibiotic Resistance Phenotype from Genotype

Authors

Dongheon Lee - Presenter, Duke University
Emrah Şimşek, Duke University
Lingchong You, Department of Molecular Genetics and Microbiology
β-lactam antibiotics are the most commonly prescribed first-line treatment for bacterial infections. Under this selective pressure, however, bacteria have acquired resistance to β-lactams through the expression of β-lactamases. Such resistance has become increasingly prevalent, contributing to a steady global rise in mortality from bacterial infections [1]. Meanwhile, the approval rate of new antibiotics has declined for economic and scientific reasons [2]. It is therefore important to understand precisely how bacteria carrying β-lactamases behave under antibiotic treatment, so that treatment efficiency can be optimized. Motivated by this, we propose a transfer-learning framework that predicts the phenotype of bacteria resistant to β-lactams. In line with our previous study, we quantify the antibiotic resistance phenotype by a quantity called private benefit [3], which refers to the extent to which bacteria carrying β-lactamases are protected by the enzymes they produce. This study aims to construct a machine learning (ML) model that predicts this phenotype from the sequences of the β-lactamase enzymes expressed by the bacteria and the molecular structures of the β-lactams. To improve the generalizability of the model, we will leverage biomolecular large language models, ESM-2 [4] and Molformer [5], to represent the enzymes and the small molecules, respectively. We will then build an ML model that maps the latent representations of the β-lactamase and the β-lactam to the bacterial resistance phenotype. After successful training, we will construct a large library of β-lactamases, each of which will be expressed in E. coli to measure its private benefit phenotype. These measurements will be used to validate the model's predictions and, if necessary, to further refine its accuracy.
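The mapping from latent representations to phenotype described above can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the prediction head, so ridge regression is assumed here purely for illustration. The function names (`mean_pool`, `pair_features`, `fit_ridge`, `predict`) are hypothetical, the embedding dimensions are small placeholders, and random arrays stand in for ESM-2 and Molformer outputs and for private-benefit measurements.

```python
import numpy as np


def mean_pool(token_embeddings):
    """Average per-token embeddings (length x dim) into one vector (dim,)."""
    return np.asarray(token_embeddings, dtype=float).mean(axis=0)


def pair_features(enzyme_vec, drug_vec):
    """Concatenate an enzyme vector and a drug vector into one feature row."""
    return np.concatenate([np.asarray(enzyme_vec, dtype=float),
                           np.asarray(drug_vec, dtype=float)])


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean  # center features so the intercept is handled separately
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b


def predict(X, w, b):
    """Apply the fitted linear head to a feature matrix."""
    return np.asarray(X, dtype=float) @ w + b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder sizes; real language-model embeddings are much wider.
    n_pairs, enz_dim, drug_dim = 40, 8, 6
    rows = []
    for _ in range(n_pairs):
        # Stand-in for per-residue enzyme embeddings of a variable-length sequence.
        residues = rng.normal(size=(int(rng.integers(50, 300)), enz_dim))
        # Stand-in for a fixed-length small-molecule embedding.
        drug = rng.normal(size=drug_dim)
        rows.append(pair_features(mean_pool(residues), drug))
    X = np.stack(rows)
    y = rng.normal(size=n_pairs)  # synthetic labels, NOT measured private benefit
    w, b = fit_ridge(X[:30], y[:30], alpha=1.0)
    print(predict(X[30:], w, b).shape)
```

Mean pooling is one common way to turn variable-length per-residue embeddings into a fixed-length vector. Concatenating that vector with the molecule embedding is the simplest way to represent an enzyme-drug pair, and a nonlinear head could replace the ridge step without changing the rest of the sketch.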

1. C. L. Ventola, "The antibiotic resistance crisis: part 1: causes and threats," Pharmacy and Therapeutics, vol. 40, no. 4, p. 277, 2015.

2. L. J. Piddock, "The crisis of no new antibiotics—what is the way forward?," The Lancet infectious diseases, vol. 12, no. 3, pp. 249-253, 2012.

3. H. Ma, H. Xu, K. Kim, D. Anderson and L. You, "Private benefit of β-lactamase dictates selection dynamics of combination antibiotic treatment," Nature Communications, vol. 15, p. 8337, 2024.

4. Z. Lin, H. Akin, R. Rao, B. Hie, Z. Zhu, W. Lu, N. Smetanin, R. Verkuil, O. Kabeli, Y. Shmueli, A. dos Santos Costa, et al., "Evolutionary-scale prediction of atomic-level protein structure with a language model," Science, vol. 379, no. 6637, pp. 1123-1130, 2023.

5. J. Ross, B. Belgodere, V. Chenthamarakshan, I. Padhi, Y. Mroueh and P. Das, "Large-scale chemical language representations capture molecular structure and properties," Nature Machine Intelligence, vol. 4, no. 12, pp. 1256-1264, 2022.