2025 Spring Meeting and 21st Global Congress on Process Safety

(98c) Autonomous Industrial Control Using an Agentic Framework with Large Language Models

Authors

Javal Vyas - Presenter, National Energy Technology Support Contractor
Chemical plants today are moving toward autonomous operations. Autonomous operation is considered feasible especially for routine operations that follow well-defined procedures [1]. However, handling anomalies remains a critical challenge in chemical plant automation, and as an anomalous situation unfolds, responsibility for managing the operation usually falls to human operators. Although existing machine learning models have made significant progress in handling known unknowns, such as predictable disturbances or plant-model mismatches, they often fail to handle anomalies. This is primarily because these models are trained on majority-class data, since anomaly data is scarce or available in too few samples [2]. As a result, models struggle to detect and react to anomalies in real time, particularly in scenarios involving unknown unknowns: unforeseen disturbances that the system was not designed to handle. The overarching goal of this work is to bridge this gap in autonomous systems by using Large Language Models (LLMs) as intelligent control agents.

LLMs, with their extensive knowledge bases and reasoning capabilities, represent a promising avenue for developing intelligent control agents capable of autonomously analyzing incoming data, diagnosing anomalies, and making informed control decisions [3]. The challenge lies in transitioning to a fully automated system that can evaluate its own responses and independently adjust actions deemed unsafe. To address this, we propose a reprompting architecture that empowers LLMs to function as autonomous control agents. This architecture enables agents to validate their actions against a digital twin, implementing them in the physical system if they pass validation; if not, the agent is prompted to revise its approach. This iterative process significantly enhances decision-making capabilities and improves system performance in real time. To illustrate the potential of this approach, we present a case study focused on temperature control using a physical microcontroller.

The proposed framework introduces a modular and adaptive LLM-based multi-agent system, centered on programmatically leveraging reprompting, via a Reprompter Agent, to guide an Actor Agent toward safe and optimal solutions. Each agent is assigned a specific role, equipped with tools, and tasked with distinct actions that contribute to the overarching system objectives. This section outlines the framework’s role in enhancing system reliability and responsiveness through a coordinated agent-based approach.

Fig 1 – Proposed Framework Schematic Representation

The core of this framework is built around three principal agents—the Actor Agent, Validator Agent, and Reprompter Agent—that interact with a simulated digital twin environment (see Fig 1). This digital twin serves as a controlled proxy for the physical system, enabling safe validation of actions and structured feedback loops.

  • Actor Agent: The Actor Agent initiates actions aimed at achieving control objectives, such as modifying parameters or toggling operational states. It operates based on predefined goals, and once it formulates an action, the Actor Agent passes this decision to the digital twin. This simulation evaluates the potential effects of the action, minimizing the risk of unsafe interventions on the physical system.
  • Digital Twin Simulation: The digital twin emulates the behavior of the physical system in response to the Actor Agent’s actions, enabling real-time assessment in a low-risk environment. This simulated feedback captures anticipated system responses, allowing agents to test actions safely before deployment.
  • Validator Agent: Following the simulation, the Validator Agent assesses the Actor Agent’s proposed action against safety and operational criteria. If the action meets these criteria, it is ready for physical deployment. If it is deemed unsafe or suboptimal, the Validator Agent flags the action, prompting the Reprompter Agent to intervene for a predefined number of iterations, after which, if the action remains unsafe, the safety system overrides it.
  • Reprompter Agent: The Reprompter Agent is a pivotal component in ensuring system safety and refinement. When an action fails validation, the Reprompter Agent collaborates with the Actor Agent to adjust the initial decision. Using alternative prompts and iterative instructions, the Reprompter Agent reprograms the action sequence until it aligns with the Validator Agent’s criteria. This process forms a feedback loop in which each iteration is tested in the digital twin and validated again, ensuring the action is both safe and optimized. The loop persists until the action either satisfies validation standards or reaches a predefined limit on iterations, safeguarding stability in the control process.
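The interaction among these agents can be sketched as a simple control loop. The sketch below is illustrative only: the toy twin model, safe band, and feedback strings are hypothetical stand-ins for the LLM prompts, digital twin simulation, and validation criteria used in the actual framework.

```python
# Illustrative sketch of the Actor -> Digital Twin -> Validator -> Reprompter loop.
# All models and thresholds here are hypothetical stand-ins, not the paper's code.

MAX_ITERATIONS = 3  # predefined reprompt limit before the safety system overrides

def digital_twin(action, temp_c):
    """Toy twin: predict the next temperature for a heater on/off action."""
    return temp_c + (1.5 if action == "heater_on" else -0.8)

def validator(predicted_temp_c):
    """Approve only actions whose predicted outcome stays in a safe band."""
    return 20.0 <= predicted_temp_c <= 30.0

def actor(temp_c, feedback=None):
    """Stand-in for the LLM Actor Agent; `feedback` mimics a reprompt."""
    if feedback == "too_hot":
        return "heater_off"
    if feedback == "too_cold":
        return "heater_on"
    return "heater_on" if temp_c < 25.0 else "heater_off"

def control_step(temp_c):
    """One pass through the reprompting loop for the current temperature."""
    feedback = None
    for _ in range(MAX_ITERATIONS):
        action = actor(temp_c, feedback)
        predicted = digital_twin(action, temp_c)
        if validator(predicted):
            return action  # validated: safe to deploy on the physical system
        feedback = "too_hot" if predicted > 30.0 else "too_cold"  # reprompt
    return "heater_off"  # iteration limit reached: safety system override

print(control_step(24.0))  # heater_on (predicted 25.5 °C passes validation)
```

In the framework itself, `actor` and the reprompt feedback are LLM calls, and `digital_twin` is the simulation environment; only the loop structure carries over.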

The following case study demonstrates the application of the proposed LLM-based multi-agent framework to autonomously control a physical Arduino microcontroller known as TCLab [4]. The setup manages heater operation based on specific temperature thresholds: the heater is turned off when the temperature exceeds 27°C and turned on when it falls below 25°C. This creates a cyclical oscillation between these thresholds, with the control sequence monitored over a 40-minute period. The goal of this case study is to assess how effectively reprompting improves the proposed multi-agent framework’s autonomous control under real-world conditions.
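The threshold policy the agents are asked to reproduce is a simple hysteresis (on/off) rule. A minimal reference implementation is sketched below, with a toy first-order thermal model standing in for the TCLab hardware; on the physical device, temperature would instead be read through the `tclab` package (e.g. `lab.T1`) and the heater driven with `lab.Q1(...)`. The model gains are illustrative assumptions, not calibrated to TCLab.

```python
def hysteresis_action(temp_c, heater_on):
    """Case-study thresholds: heater off above 27 °C, on below 25 °C,
    unchanged inside the 25–27 °C deadband."""
    if temp_c > 27.0:
        return False
    if temp_c < 25.0:
        return True
    return heater_on

def simulate(minutes=40, dt_s=5.0, start_c=23.0, ambient_c=23.0):
    """Toy first-order thermal model in place of the TCLab hardware.
    The heating rate and cooling coefficient are illustrative assumptions."""
    temp, on, history = start_c, False, []
    for _ in range(int(minutes * 60 / dt_s)):
        on = hysteresis_action(temp, on)
        # heater adds heat when on; Newton-style cooling toward ambient always acts
        temp += (0.15 if on else 0.0) - 0.02 * (temp - ambient_c)
        history.append((temp, on))
    return history

profile = simulate()
print(f"{min(t for t, _ in profile):.2f}–{max(t for t, _ in profile):.2f} °C")
```

In the framework, the Actor Agent must infer this rule from its prompt and the current temperature reading, while the digital twin plays the role of `simulate` above.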

The performance of the proposed framework, leveraging OpenAI’s suite of large language models (LLMs) as control agents, was evaluated within a temperature regulation case study. This evaluation centered on the agents’ accuracy in executing control actions. We measured accuracy in two settings, initial-pass accuracy and accuracy after reprompting, to analyze the models’ ability to correct missteps autonomously.

Metric                           GPT 3.5   GPT 4o mini   GPT 4o   GPT 4

Accuracy – first pass (%)          60.04         72.49    99.63   93.75
Accuracy – after reprompts (%)     85.34         89.97    99.81   96.09
Samples                              423           394      554     128
Passes                               254           253      552     120
Fails                                169            61        2       8
Pass after reprompts                 107            96        1       3
Table 1 – Performance of language models in the proposed framework

Table 1 presents the performance metrics for four language models—GPT 3.5-turbo, GPT 4o mini, GPT 4o, and GPT 4—within the proposed CrewAI-based framework for temperature regulation tasks. The results indicate that the framework is effective, with GPT 4o achieving the highest initial accuracy at 99.63%. The other models also performed well: GPT 4 at 93.75%, GPT 4o mini at 72.49%, and GPT 3.5-turbo at 60.04% (the corresponding temperature profiles are shown in Figs. 2–5). These variations demonstrate the differing capabilities of the models within the framework. Significantly, the introduction of reprompting produced notable improvements across all models. For example, GPT 3.5-turbo’s accuracy increased from 60.04% to 85.34%, while GPT 4o mini improved from 72.49% to 89.97%. Even the higher-performing models, GPT 4 and GPT 4o, saw gains, reaching 96.09% and 99.81%, respectively. These results confirm that the proposed framework effectively utilizes LLMs as control agents, and that the reprompting mechanism significantly enhances accuracy and reliability, especially for models with initially lower performance.
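Under the assumption that first-pass accuracy is passes divided by samples, and post-reprompt accuracy additionally credits passes gained through reprompting, most of the Table 1 percentages can be reproduced from the raw counts up to rounding (the GPT 4o mini row appears to use a slightly different basis). This counting rule is an inference from the table, not stated in the text:

```python
def accuracies(samples, passes, passes_after_reprompts):
    """Hypothetical counting rule: first-pass accuracy = passes / samples;
    post-reprompt accuracy also credits passes gained via reprompting."""
    first = 100.0 * passes / samples
    post = 100.0 * (passes + passes_after_reprompts) / samples
    return round(first, 2), round(post, 2)

# GPT 4 row of Table 1: 128 samples, 120 first-pass passes, 3 passes after reprompts
print(accuracies(128, 120, 3))  # (93.75, 96.09), matching the reported values
```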

Fig 2 – Temperature profile GPT 3.5

Fig 3 – Temperature profile GPT 4o mini

Fig 4 – Temperature profile GPT 4o

Fig 5 – Temperature profile GPT 4

These findings validate the viability of LLM-based systems in autonomous industrial control, where rapid and precise decision-making is essential. While this case study focused on a relatively straightforward task, the adaptability of the framework positions it well for more complex control scenarios. Future work may explore its deployment in fault detection and digital twin environments, where real-time decision-making is critical in dynamic and unpredictable settings. Overall, this research supports the integration of LLMs with a reprompting architecture as a vital step toward realizing fully autonomous and intelligent industrial systems.

References:

[1] Borghesan, F., Zagorowska, M., and Mercangoz, M. (2022). Unmanned and autonomous systems: Future of automation in process and energy industries. IFAC-PapersOnLine, 55(7), 875–882.

[2] Hanga, K.M. and Kovalchuk, Y. (2019). Machine learning and multi-agent systems in oil and gas industry applications: A survey. Computer Science Review, 34, 100191.

[3] Pantelides, C., Baldea, M., Georgiou, A.T., Gopaluni, B., Mehmet, M., Sheth, K., Zavala, V.M., and Georgakis, C. (2024). From automated to autonomous process operations. doi:10.2139/ssrn.4963632.

[4] Oliveira, P.M. and Hedengren, J.D. (2019). An APMonitor temperature lab PID control experiment for undergraduate students. In 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), 790–797. IEEE.