2025 AIChE Annual Meeting

(395d) Development of a Generative AI Tool for the Concept Warehouse

Authors

Namrata Shivagunde, University of Massachusetts Lowell
Anna Rumshisky, University of Massachusetts Lowell
Milo Koretsky, Oregon State University
Generative Artificial Intelligence (GenAI) tools present both challenges and promise for learning chemical engineering. Educators have reported using GenAI to assist students (e.g., real-time tutoring and feedback), instructors (e.g., adaptive lesson design and grading), and researchers (e.g., automated qualitative coding and analysis). However, there is a continued need to develop pedagogically appropriate, discipline-specific GenAI systems attuned to existing educational ecosystems. In this poster, we outline our work to train open-source large language models (LLMs) to analyze short-answer justifications that students wrote in response to concept questions in thermodynamics and mechanics courses, and we describe the development of a generative AI tool for the Concept Warehouse (CW).

The CW is a free, web-based active learning tool that serves as an audience response system and content repository. More than 40,000 students and 1,700 faculty use the CW, which hosts a repository of more than 3,600 questions. In this study, instructors are asked to deliver short-answer follow-ups to concept questions: single-right-answer multiple-choice questions that require minimal to no calculations and ask students to operationalize recently learned concepts. To train and evaluate the LLMs, 3,687 responses across three thermodynamics questions were collected from consenting students at a diverse array of two- and four-year institutions between 2012 and 2024. We then evaluated state-of-the-art (SOTA) LLMs, including GPT-4 and GPT-4o-mini, via in-context learning. We also fine-tuned and evaluated the open-source SOTA LLMs Llama-3-8B, Phi-3.5-mini, and Mixtral 8x7B Instruct to examine the accuracy and precision of the automated generation of qualitative codes. Through this work, we aim to develop the CW AI, a tool to assist instructors and researchers in analyzing the short-answer justifications that students write to explain their reasoning on concept questions.
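As an illustration of in-context learning for this kind of task, the sketch below assembles a few-shot prompt from human-coded examples and leaves the final response for the model to code. It is a minimal sketch, not the prompt used in this work: the `CodedExample` class, the `build_few_shot_prompt` function, the prompt wording, and the code names `molecular_reasoning` and `macroscopic_reasoning` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CodedExample:
    """A student response paired with human-assigned codes (hypothetical structure)."""
    response: str
    codes: list[str]


def build_few_shot_prompt(question, codebook, examples, new_response):
    """Assemble a plain-text few-shot prompt for qualitative coding.

    The layout is an assumption made for illustration; the wording and
    format of the actual prompts are not described in the abstract.
    """
    # Reject examples that use codes missing from the codebook.
    for ex in examples:
        unknown = [c for c in ex.codes if c not in codebook]
        if unknown:
            raise ValueError(f"codes not in codebook: {unknown}")

    lines = [
        "You are assisting with qualitative coding of student responses.",
        f"Concept question: {question}",
        "Allowed codes:",
    ]
    lines += [f"- {code}" for code in codebook]

    # Each human-coded example becomes one demonstration.
    for ex in examples:
        lines.append(f"Response: {ex.response}")
        lines.append("Codes: " + ", ".join(ex.codes))

    # The uncoded response goes last; the model completes the final line.
    lines.append(f"Response: {new_response}")
    lines.append("Codes:")
    return "\n".join(lines)


# Hypothetical codebook and demonstration, invented for this sketch.
codebook = ["molecular_reasoning", "macroscopic_reasoning"]
demos = [
    CodedExample("Bonds between unlike molecules are weaker.", ["molecular_reasoning"]),
]
prompt = build_few_shot_prompt(
    "What is the sign of the enthalpy of mixing?",
    codebook,
    demos,
    "The mixture cools, so heat must be absorbed.",
)
print(prompt)
```

The resulting string could be sent to any of the evaluated models through whatever client is in use; that call is omitted here because it depends on the provider.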

We used a two-stage coding process to code short-answer responses for cognitive resources, or “chunks” of knowledge activated within a response. For example, in the responses to a thermodynamics question about the enthalpy of mixing for non-ideal solutions, we found that cognitive resources related to molecular reasoning were not as productive for students as macro-level reasoning skills, contrary to findings in the thermodynamics literature. We observed that open-source models such as Mixtral 8x7B Instruct and Llama-3 perform best on in-domain tasks. Finally, we discuss our progress toward developing the CW AI, which has broader implications for how engineering education researchers and instructors use GenAI in classrooms.
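The accuracy and precision of automated coding can be made concrete with a small scoring sketch that compares model-assigned codes against human-assigned codes. The abstract does not define these metrics, so this sketch assumes that each response may carry several codes, that accuracy means an exact match of the full code set, and that precision is computed per code. The function names and the toy data are hypothetical.

```python
def exact_match_accuracy(human, predicted):
    """Fraction of responses whose predicted code set equals the human code set."""
    if len(human) != len(predicted):
        raise ValueError("human and predicted must have the same length")
    if not human:
        raise ValueError("no responses to score")
    matches = sum(set(h) == set(p) for h, p in zip(human, predicted))
    return matches / len(human)


def precision_for_code(human, predicted, code):
    """Of the responses the model tagged with `code`, the fraction humans also tagged."""
    if len(human) != len(predicted):
        raise ValueError("human and predicted must have the same length")
    predicted_positive = sum(code in p for p in predicted)
    true_positive = sum(code in p and code in h for h, p in zip(human, predicted))
    # Report 0.0 when the model never assigns the code (a convention, not a rule).
    return true_positive / predicted_positive if predicted_positive else 0.0


# Toy data invented for illustration; one list of codes per response.
human_codes = [["macro"], ["molecular"], ["macro", "molecular"], []]
model_codes = [["macro"], ["macro"], ["macro", "molecular"], ["molecular"]]

print(exact_match_accuracy(human_codes, model_codes))         # 2 of 4 match exactly
print(precision_for_code(human_codes, model_codes, "macro"))  # 2 of 3 predictions correct
```

Exact-match accuracy is strict when responses carry multiple codes, so per-code precision helps show which codes a model tends to over-assign.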