(66l) Towards a Comprehensive Reaction Database for Organometallics
2025 AIChE Annual Meeting, Engineering Sciences and Fundamentals, Faculty Candidates in CoMSEF/Area 1a, Session 1
To address this critical gap, our group is developing a comprehensive database of transition metal reactions using two complementary methodologies. First, we use the computational platform Yet Another Reaction Program (YARP; ref 3) to systematically explore reaction networks through bond-forming and bond-breaking events. YARP automates the enumeration of reaction products, the prediction of reaction barriers, and the identification of reaction intermediates, thereby generating extensive reaction graphs. This computationally driven approach supports high-throughput screening and systematic investigation of reaction spaces beyond what is practical to probe experimentally.
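To make the idea of exploring a reaction network through bond-breaking and bond-forming events concrete, the following is a minimal sketch in Python. It is not YARP's actual interface or algorithm: the function name `enumerate_products`, the atom labels, and the choice of one broken and one formed bond per step are assumptions made purely for illustration. It also ignores valence rules, bond orders, and electron counting, all of which a real enumeration would need.

```python
from itertools import combinations


def enumerate_products(atoms, bonds, n_break=1, n_form=1):
    """Enumerate product bond graphs reachable in one step.

    Hypothetical helper (not part of YARP): breaks `n_break` existing
    bonds and forms `n_form` bonds between currently unbonded atom pairs.
    Bonds are treated as unordered pairs with no bond order.
    """
    bond_set = frozenset(frozenset(b) for b in bonds)
    all_pairs = {frozenset(p) for p in combinations(atoms, 2)}
    non_bonds = all_pairs - bond_set

    products = set()
    # Sorting makes the iteration order deterministic.
    for broken in combinations(sorted(bond_set, key=sorted), n_break):
        for formed in combinations(sorted(non_bonds, key=sorted), n_form):
            product = (bond_set - set(broken)) | set(formed)
            products.add(frozenset(product))
    return products


# Toy four-atom system with illustrative labels; not a real complex.
atoms = ["Pd", "H", "C1", "C2"]
bonds = [("Pd", "H"), ("C1", "C2")]
products = enumerate_products(atoms, bonds)
print(len(products))  # prints 8
```

Each product graph becomes a node in the reaction graph, and applying the same step to every new product extends the network to greater depth. In practice, candidates produced this way would still need filtering for chemical validity and barrier evaluation before being stored.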
In parallel, we are using large language model (LLM) agents and natural language processing (NLP) to extract and curate reaction data directly from the existing literature. After initial extraction, these literature-identified reactions undergo computational refinement and transition state calculations to verify their accuracy and reliability. By integrating computational enumeration with literature-based extraction, we are building a comprehensive transition metal reaction database, structurally analogous to the widely used Reaction Graph Depth 1 (RGD1) dataset (ref 4) for organic molecules. This combined approach is intended to broaden and deepen the coverage of transition metal reactivity, improving predictive capabilities and accelerating catalyst design in both homogeneous and heterogeneous catalysis.
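The way the two data streams could be combined can be sketched as follows. This is only an illustration under stated assumptions: the `ReactionRecord` fields, the `merge_records` and `pending_validation` helpers, and the string identifiers are all hypothetical, and they are not taken from the RGD1 schema or from the group's actual pipeline. The sketch assumes each reaction is keyed by a reactant and product identifier and that literature-extracted entries remain unvalidated until a transition state calculation has been run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReactionRecord:
    """Hypothetical record layout; field names are assumptions."""
    reactant: str
    product: str
    source: str  # "enumerated" or "literature"
    barrier_kcal_mol: Optional[float] = None
    ts_validated: bool = False


def merge_records(enumerated, literature):
    """Merge both sources, keeping one record per (reactant, product).

    When a reaction appears twice, a record with a validated transition
    state replaces an unvalidated one; otherwise the first one is kept.
    """
    db = {}
    for rec in list(enumerated) + list(literature):
        key = (rec.reactant, rec.product)
        existing = db.get(key)
        if existing is None:
            db[key] = rec
        elif rec.ts_validated and not existing.ts_validated:
            db[key] = rec
    return db


def pending_validation(db):
    """Return records that still need a transition state calculation."""
    return [rec for rec in db.values() if not rec.ts_validated]


# Placeholder identifiers and a placeholder barrier; not real results.
enumerated = [ReactionRecord("A", "B", "enumerated", 20.0, True)]
literature = [
    ReactionRecord("A", "B", "literature"),
    ReactionRecord("C", "D", "literature"),
]
db = merge_records(enumerated, literature)
queue = pending_validation(db)
print(len(db), [(r.reactant, r.product) for r in queue])
```

Keeping a `source` field on every record preserves provenance, so entries found only in the literature can be queued for refinement while duplicates of already-validated enumerated reactions are not recomputed.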
References:
(1) Balcells, D.; Skjelstad, B. B. tmQM Dataset—Quantum Geometries and Properties of 86k Transition Metal Complexes. J. Chem. Inf. Model. 2020, 60 (12), 6135–6146. https://doi.org/10.1021/acs.jcim.0c01041.
(2) Kevlishvili, I.; Michel, R. G. S.; Garrison, A. G.; Toney, J. W.; Adamji, H.; Jia, H.; Román-Leshkov, Y.; Kulik, H. J. Leveraging Natural Language Processing to Curate the tmCAT, tmPHOTO, tmBIO, and tmSCO Datasets of Functional Transition Metal Complexes. Faraday Discuss. 2025, 256 (0), 275–303. https://doi.org/10.1039/D4FD00087K.
(3) Zhao, Q.; Savoie, B. M. Simultaneously Improving Reaction Coverage and Computational Cost in Automated Reaction Prediction Tasks. Nat. Comput. Sci. 2021, 1 (7), 479–490. https://doi.org/10.1038/s43588-021-00101-3.
(4) Zhao, Q.; Vaddadi, S. M.; Woulfe, M.; Ogunfowora, L. A.; Garimella, S. S.; Isayev, O.; Savoie, B. M. Comprehensive Exploration of Graphically Defined Reaction Spaces. Sci. Data 2023, 10 (1), 145. https://doi.org/10.1038/s41597-023-02043-z.