2024 AIChE Annual Meeting

(196g) Autonomous Faithful Retrosynthesis with Large Language Models: From Synthesis Planning to Experimental Procedures

Authors

Liu, X. - Presenter, Xi'an Jiaotong University
Chiu, S., University of Illinois Urbana-Champaign
Zhao, H., University of Illinois-Urbana
Recent advances in large language models (LLMs) have enabled their application to autonomous experimental design and execution in chemistry research [1,2]. However, a significant limitation arises from their dependence on existing retrosynthesis tools, which often fail to provide accurate references, or from their inability to search the Internet for synthesis information on novel target molecules. Furthermore, ensuring the faithfulness of generated content remains a paramount concern in the scientific domain, because unfaithful generations often include implausible reactions or inappropriate reaction conditions. The capability of LLMs to perform reliable and faithful retrosynthesis, particularly with respect to reaction templates and experimental procedures, remains largely unexplored.

Herein, we introduce FaithRetro, a novel LLM-based intelligent agent designed to accurately predict precursors, deduce reaction conditions, and draft experimental procedures. FaithRetro interprets reaction templates, navigates local reaction databases, and assesses experimental procedures, which allows it to generate reliable retrosynthesis routes and experimental protocols. Its efficacy and adaptability were evaluated across several tasks: precursor prediction for target molecules, efficient retrieval of relevant experimental procedures, and the assessment and prioritization of experimental procedures according to criteria such as action complexity, reagent toxicity, and reaction duration. The results demonstrate FaithRetro’s capacity to innovate upon existing procedures based on user-defined criteria. Lastly, FaithRetro was tasked with autonomously predicting precursors and generating practical experimental procedures with references, given a prompt containing a target molecule and procedure constraints. The findings suggest that FaithRetro, as an intelligent system, significantly broadens the applicability of LLMs in retrosynthesis, markedly reducing human intervention and simplifying content verification by ensuring that predicted content is faithful and accompanied by references.
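The prioritization of procedures by action complexity, reagent toxicity, and reaction duration can be illustrated with a minimal sketch. This is not FaithRetro's implementation, which the abstract does not describe. The `Procedure` fields, the toxicity scale, the min-max normalization, and the weighted-sum cost are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Procedure:
    """Hypothetical record for one candidate experimental procedure."""
    name: str
    n_actions: int      # number of lab actions, an assumed proxy for action complexity
    toxicity: float     # assumed reagent toxicity score in [0, 1], higher is worse
    duration_h: float   # total reaction duration in hours


def _normalise(values: Sequence[float]) -> List[float]:
    """Min-max scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def rank_procedures(
    procedures: Sequence[Procedure],
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> List[Procedure]:
    """Sort procedures from most to least preferred (lowest weighted cost first).

    weights = (complexity, toxicity, duration) stand in for user-defined criteria.
    """
    if not procedures:
        return []
    columns = [
        _normalise([p.n_actions for p in procedures]),
        _normalise([p.toxicity for p in procedures]),
        _normalise([p.duration_h for p in procedures]),
    ]
    costs = [
        sum(w * col[i] for w, col in zip(weights, columns))
        for i in range(len(procedures))
    ]
    # sorted() is stable, so ties keep their input order.
    order = sorted(range(len(procedures)), key=lambda i: costs[i])
    return [procedures[i] for i in order]
```

Changing `weights` shifts the ranking toward whichever criterion the user cares about most. For example, `(0.0, 1.0, 0.0)` ranks purely by the assumed toxicity score.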

[1] Boiko, Daniil A., et al. "Autonomous chemical research with large language models." Nature 624.7992 (2023): 570-578.

[2] Bran, Andres M., et al. "ChemCrow: Augmenting large-language models with chemistry tools." arXiv preprint arXiv:2304.05376 (2023).