2022 Spring Meeting and 18th Global Congress on Process Safety Proceedings
(167a) Safecrit and Digital Twin: An Integrated Set of Technologies for Protecting Water Treatment and Distribution Plants Against Extreme Cyber Attacks
SafeCrit technologies
A brief description of key technologies that comprise SafeCrit follows. DAD is a design-centric process anomaly detector. State entanglement, a patented method, is a key to the ultra-high anomaly detection rate of DAD. The idea is to use fundamental laws of physics to create relations across plant components. These relations, known as invariants, are programmed to monitor the underlying plant process.
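As an illustration of what a physics-based invariant might look like, the following minimal sketch checks a mass-balance relation on a single tank: the change in water level over one sampling interval should match the net flow divided by the tank cross-section. This is not the DAD implementation or the patented state entanglement method. The names `TankSample`, `mass_balance_holds`, and `check_trace`, as well as the tank area, sampling interval, and tolerance, are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TankSample:
    """One reading of a hypothetical tank (all fields are illustrative)."""
    level_m: float       # measured water level, metres
    inflow_m3s: float    # measured inflow, cubic metres per second
    outflow_m3s: float   # measured outflow, cubic metres per second


def mass_balance_holds(prev: TankSample, curr: TankSample,
                       area_m2: float, dt_s: float, tol_m: float) -> bool:
    """True if the observed level matches the level implied by net flow."""
    expected = prev.level_m + (prev.inflow_m3s - prev.outflow_m3s) * dt_s / area_m2
    return abs(curr.level_m - expected) <= tol_m


def check_trace(samples: List[TankSample], area_m2: float = 1.0,
                dt_s: float = 1.0, tol_m: float = 0.005) -> List[int]:
    """Return the indices of samples that violate the invariant."""
    return [
        i for i in range(1, len(samples))
        if not mass_balance_holds(samples[i - 1], samples[i], area_m2, dt_s, tol_m)
    ]


# Assumed data: the level sensor is frozen at index 2 while water keeps flowing in.
trace = [
    TankSample(0.50, 0.01, 0.0),
    TankSample(0.51, 0.01, 0.0),
    TankSample(0.51, 0.01, 0.0),  # spoofed: level should have risen to 0.52
    TankSample(0.52, 0.01, 0.0),
]
print(check_trace(trace))
```

Because the relation ties the level sensor to two flow sensors, an attacker who spoofs only one of them breaks the invariant, which is the intuition behind relating components through physical laws.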
AICrit uses machine learning and design knowledge to create models that capture the normal spatio-temporal relationships among sets of correlated components. It models the normal behavior of continuous-valued state variables (sensors) through the temporal dependencies to forecast their behavior. In addition, it also models the higher-order and non-linear correlations among the discrete (actuators) and continuous state variables.
SuperDetector fuses real-time operational data from the plant with the outputs received from AICrit and DAD. It uses a stacked ensemble of Random Forest, Bagging with k-Nearest Neighbours, and Gaussian Naïve Bayes algorithms to classify normal and anomalous data. This results in high accuracy of detection and a low rate of false alarms.
PlantProtect thwarts an attacker's attempt at moving the plant to an anomalous state. It prevents a rogue command from the controller from reaching its target actuator. A command sent to an actuator is first trapped and validated, and is released only when the actual and predicted commands are the same; otherwise, the plant operator is asked to take an appropriate action.
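The trap, validate, and release sequence can be sketched as a small gate that sits between the controller and the actuator. This is a minimal sketch under stated assumptions, not the PlantProtect implementation: the `Command` and `CommandGate` classes, the `predict` callable, and the actuator tags are all hypothetical, and a real predictor would derive the expected command from the plant state.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Command:
    """A command addressed to one actuator (names are illustrative)."""
    actuator: str
    value: str  # e.g. "OPEN", "CLOSE", "ON", "OFF"


@dataclass
class CommandGate:
    """Traps each command and releases it only if it matches the prediction."""
    predict: Callable[[str], str]  # hypothetical: actuator tag -> expected value
    released: List[Command] = field(default_factory=list)
    held_for_operator: List[Command] = field(default_factory=list)

    def submit(self, cmd: Command) -> bool:
        """Return True if the command was released to the actuator."""
        expected = self.predict(cmd.actuator)
        if cmd.value == expected:
            self.released.append(cmd)           # validated: forward to actuator
            return True
        self.held_for_operator.append(cmd)      # mismatch: wait for the operator
        return False


# Assumed predictions for two hypothetical actuators.
expected_state = {"valve_1": "OPEN", "pump_1": "OFF"}
gate = CommandGate(predict=lambda tag: expected_state[tag])

print(gate.submit(Command("valve_1", "OPEN")))  # matches the prediction
print(gate.submit(Command("pump_1", "ON")))     # rogue command, held back
```

Holding a mismatched command rather than silently dropping it keeps the operator in the loop, which mirrors the request for operator action described above.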
Embedding SafeCrit in a digital twin
iTrust has developed a digital twin that mimics the operation of a 6-stage water treatment plant named SWaT. The twin is a fully distributed system that includes over 50 processes that mimic the behavior of SCADA, Engineering Workstation, PLCs, RIOs, actuators, sensors, and the historian. A professional HMI, also developed at iTrust, is integrated with the twin. The twin uses the now industry-standard OPC protocol for communications among PLCs and SCADA and the well-known ZMQ protocol for communications across other processes. Modbus and ENIP protocols are currently being added to the twin as options.
All SafeCrit technologies have been integrated into the twin. Doing so has enabled researchers and educators to understand and test SafeCrit within a virtual environment at a much faster pace than they could on a live plant. The twin has been used in the world's largest cyber-exercise, named Locked Shields, conducted each year by the NATO Cooperative Cyber Defence Centre of Excellence. In this exercise, 26 instances of the twin were deployed and attacked by teams from NATO countries.
Quality requirements for SafeCrit
A major challenge faced by developers of cyber security technologies arises from the fact that extreme cyber attacks are rare when compared with the almost daily attempted, and often successful, network intrusions. However, when an extreme cyber attack occurs in a critical infrastructure, it could lead to significant financial and social chaos. This observation leads to stringent quality requirements for any cyber security technology aimed at defending water treatment and distribution plants.
From the outset, SafeCrit was required to demonstrate an ultra-high rate of anomaly detection and an ultra-low rate of false alarms. The ideal requirement is Zero False Alarms (ZeFA) and a 100% anomaly detection rate. While both requirements are equally important, success at meeting ZeFA is often overestimated: a false alarm rate might appear exceptionally low and yet yield an unusable technology. This happens when data is received at a high rate, e.g., every second. In this situation, even a 0.1% rate of false alarms would imply about 86 false alarms per day, which is obviously unacceptable for any plant operator. Thus, for SafeCrit, the target rate of false alarms was set to 1 in 30 days and the target anomaly detection rate to 95%.
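The arithmetic behind these figures can be reproduced with a short calculation. The sketch below assumes one sample per second and treats the false alarm rate as a per-sample probability; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 samples per day at one sample per second


def false_alarms_per_day(per_sample_rate: float,
                         samples_per_day: int = SECONDS_PER_DAY) -> float:
    """Expected number of false alarms per day for a per-sample rate."""
    return per_sample_rate * samples_per_day


def per_sample_rate_for(alarms: int, days: int,
                        samples_per_day: int = SECONDS_PER_DAY) -> float:
    """Per-sample false alarm rate that yields `alarms` over `days`."""
    return alarms / (days * samples_per_day)


# A 0.1% per-sample rate gives about 86 false alarms per day.
print(f"{false_alarms_per_day(0.001):.1f}")

# One false alarm in 30 days corresponds to a far smaller per-sample rate.
print(f"{per_sample_rate_for(1, 30):.2e}")
```

Under these assumptions, the target of 1 false alarm in 30 days corresponds to roughly 1 false alarm in 2.6 million samples, several orders of magnitude stricter than a 0.1% rate.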
Test infrastructure
In addition to the digital twin, iTrust offers two operational testbeds for water treatment and water distribution. These testbeds are referred to as SWaT for water treatment and WADI for water distribution. The testbeds mimic water treatment and distribution processes found in city-scale plants. These testbeds are used for continuous testing of SafeCrit technologies.
The Secure Water Treatment (SWaT) testbed contains a 6-stage water treatment plant producing 5 gallons per minute of water filtered using ultrafiltration and reverse osmosis. The Water Distribution (WADI) plant contains three stages to distribute 5000 litres of water to 6 consumers. The testbeds are connected so that filtered water from SWaT enters the primary grid in WADI, is dosed chemically, and is passed to the secondary grid. Water flows by gravity from the secondary grid to the consumer tanks. Booster pumps are used when the consumer demand cannot be met with gravity flow.
Testing SafeCrit
SafeCrit has been subjected to three increasingly intense levels of testing. At the lowest level, researchers involved in the development of SafeCrit tested it against an iTrust benchmark of extreme cyber attacks. This benchmark contains single-point and multi-point attacks that aim at deceiving a Programmable Logic Controller (PLC) into concluding that the spoofed data is correct. At the next level, SafeCrit is tested by independent attackers participating in an international exercise known as Critical Infrastructure Showdown (CISS) organized by iTrust. At the highest level, parts of SafeCrit are tested at a city-scale water treatment plant.
Deploying SafeCrit
The deployment strategy for SafeCrit depends on the plant architecture. Deployment could be fully non-invasive or invasive. Non-invasive deployment is done by interfacing SafeCrit with the plant historian. Data from the historian is passed to SafeCrit via a data diode ensuring one-way communication and plant safety. Thus, at every sampling instant, SafeCrit receives plant state via the historian and generates an alert when the plant moves into an anomalous state.
Non-invasive deployment limits SafeCrit's capability in realizing its objectives. For example, in non-invasive mode, PlantProtect is unable to receive data from the level 0 network and hence cannot use the trap-validate-release strategy. However, it can still correct sensor states based on predicted values. While state predictions in SafeCrit are accurate, a clever attacker can still deceive SafeCrit by launching multi-point attacks.
Invasive deployment requires creation of the level 0 network. This is done by tapping the commands sent from the PLC to the actuators via the Remote IO (RIO). In addition, selected invariants in DAD are added to the control code in each PLC. Such deployment leads to a significantly more robust SafeCrit, making it extremely challenging for an attacker to bypass its detection and protection mechanisms.
Can SafeCrit fail?
SafeCrit's anomaly detection rate is high but not 100%. Its detectors may fail, for example, when an attacker is aware of the thresholds used. These thresholds are carefully tuned with the aim of achieving ZeFA and 100% detection. A "slow attack" that remains below the thresholds may remain undetected for a long time. However, experiments have revealed that such attacks might fail to cause any plant damage or service disruption, given the standard engineering practice of using safety margins during design. For example, the overflow level of a water tank is usually set below the true height of the tank. Thus, while a "virtual overflow" could occur, an attack would be detected before water actually overflows the tank. While SafeCrit has been able to achieve ZeFA over long plant runs, it does not, and cannot, guarantee such performance at all times. Thus, false alarms could begin to occur when plant components degrade, requiring retuning of SafeCrit parameters.
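The interplay between a slow attack and a design safety margin can be illustrated with a toy simulation. The sketch below is not a SafeCrit detector: it assumes a hypothetical rate-of-change threshold, a hypothetical overflow set point placed below the physical tank height, and illustrative values in millimetres and seconds.

```python
from typing import Dict, Optional


def simulate_slow_attack(start_mm: int = 800,
                         rise_mm_per_s: int = 5,
                         rate_threshold_mm_per_s: int = 10,
                         overflow_setpoint_mm: int = 1000,
                         tank_height_mm: int = 1200,
                         max_steps: int = 1000) -> Optional[Dict[str, object]]:
    """Raise the tank level at a constant rate until some check fires.

    Returns a dict describing the first detection, or None if nothing
    fires within max_steps. All parameter values are assumptions.
    """
    level = start_mm
    for t in range(1, max_steps + 1):
        level += rise_mm_per_s
        # Rate check: fires only if the level changes faster than the threshold.
        if rise_mm_per_s > rate_threshold_mm_per_s:
            reason = "rate"
        # Level check: fires at the overflow set point ("virtual overflow").
        elif level >= overflow_setpoint_mm:
            reason = "level"
        else:
            continue
        return {
            "t_s": t,
            "reason": reason,
            "level_mm": level,
            "margin_mm": tank_height_mm - level,  # room left before real overflow
        }
    return None


# A slow attack stays under the rate threshold but still trips the set point.
print(simulate_slow_attack())

# A fast attack is caught immediately by the rate check.
print(simulate_slow_attack(rise_mm_per_s=20))
```

With these assumed values the slow attack evades the rate check for 40 seconds, yet it is flagged at the set point while 200 mm of physical headroom remains, which is the effect of the safety margin described above.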
Status of SafeCrit
SafeCrit technologies are currently being licensed for conducting pilots and incremental deployment. (Contact: Director, Office of Research and Industry Collaborations; research@sutd.edu.sg).