2025 AIChE Annual Meeting

(93h) Knowledge-Graph-Powered Interpretable AI for Drug Safety Monitoring

Authors

Crystal Su, Columbia University
Sally Liu, Columbia University
Julie Cheng, Columbia University
Jongho Bae, Columbia University
Venkat Venkatasubramanian, Columbia University
Schema-based Unsupervised Semantic Information Extraction (SUSIE), developed by our group, is a pharmaceutical information extraction tool that converts unstructured natural-language documents into knowledge graphs. SUSIE is a hybrid AI model that combines data-driven and symbolic AI tools, including ontologies, which are formal representations of domain knowledge. This hybrid framework addresses two key limitations of large language models (LLMs) in scientific and engineering domains: their opaque decision-making and their susceptibility to hallucinations. Using SUSIE, we generated knowledge graphs from publicly available databases covering different aspects of drug safety, including unexpected events during manufacturing and drug usage. SUSIE maps reports from disjoint sources, including reports of drug interactions and adverse events, to existing ontologies, including our custom-developed Columbia Ontology for Pharmaceutical Engineering. We represent these graphs quantitatively through ensembles of knowledge graph embedding models, such as TransE and RotatE, selected based on their accuracy. This representation also mitigates the ambiguity of natural-language text and thus serves as structured input for Graph Neural Networks that can detect latent drug-safety signals. As such, this text-to-knowledge-graph pipeline can act as an explainable early warning system. This interpretable approach can support decision-making and illustrates how domain-aware, agentic AI can assist scientists in safety applications.
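
To make the embedding step more concrete, the following is a minimal sketch of how TransE and RotatE score a (head, relation, tail) triple, and of one possible way to combine the two models. The scoring functions follow the published definitions of TransE (translation in real space) and RotatE (rotation in complex space). Everything else is an assumption made for illustration: the entity and relation names, the hand-picked embedding values, and the rank-averaging ensemble rule are hypothetical, since the abstract does not state how SUSIE's ensembles are trained or combined.

```python
import cmath
import math

def transe_score(h, r, t):
    """TransE: a triple is plausible when h + r lies close to t.
    Returns the negative L2 distance, so higher means more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def rotate_score(h, phases, t):
    """RotatE: the relation rotates each complex coordinate of h by a phase.
    Returns the negative sum of per-coordinate moduli of (h * e^{i*phase} - t)."""
    return -sum(abs(hi * cmath.exp(1j * p) - ti) for hi, p, ti in zip(h, phases, t))

def ensemble_rank(candidates, scorers):
    """Assumed ensemble rule: average each candidate's rank (1 = best)
    across the scoring functions and sort by that average."""
    totals = {c: 0.0 for c in candidates}
    for score in scorers:
        ordered = sorted(candidates, key=score, reverse=True)
        for rank, c in enumerate(ordered, start=1):
            totals[c] += rank
    return sorted(candidates, key=lambda c: totals[c] / len(scorers))

# Hypothetical, hand-picked embeddings (not learned, not from SUSIE).
transe_entities = {
    "drug_A": [0.0, 0.0],
    "nausea": [1.0, 0.0],
    "headache": [0.0, 1.0],
}
transe_causes = [1.0, 0.0]  # translation vector for a hypothetical "causes" relation

rotate_entities = {
    "drug_A": [1 + 0j, 1 + 0j],
    "nausea": [1j, 1 + 0j],
    "headache": [-1 + 0j, 1 + 0j],
}
rotate_causes = [math.pi / 2, 0.0]  # rotation phases for the same relation

candidates = ["nausea", "headache"]
scorers = [
    lambda c: transe_score(transe_entities["drug_A"], transe_causes, transe_entities[c]),
    lambda c: rotate_score(rotate_entities["drug_A"], rotate_causes, rotate_entities[c]),
]

# Rank candidate tails for the query (drug_A, causes, ?).
ranking = ensemble_rank(candidates, scorers)
print(ranking)
```

In this toy setup both models place "nausea" ahead of "headache" for the query (drug_A, causes, ?), so the averaged ranking agrees. In practice the embeddings would be learned from the extracted graph rather than set by hand, and the resulting vectors could then be passed to a downstream model such as a graph neural network.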