2020 Virtual AIChE Annual Meeting

(719a) Trend-Based Assessment of Industrial Rotating Equipment Health

Authors

Toothman, M. - Presenter, University of Michigan
Barton, K., University of Michigan
Moyne, J., Applied Materials
Wright, R., Dow Inc.
Dessauer, M., Dow Chemical
Bury, S., Dow Inc.

Motivation and need:

Rotating equipment is widespread throughout the process industry, where pumps, compressors, and turbines are commonly used to drive continuous manufacturing lines. Unplanned downtime due to equipment failure is an acute problem in the process industry, where any break in a continuous manufacturing operation can halt production across an entire plant. To prevent these kinds of interruptions, manufacturing plants often carry out maintenance procedures according to pre-defined schedules while keeping backup equipment at the ready in case of unexpected failures. This approach to maintenance is inherently reactive and can be costly for manufacturing facilities. Consequently, there is a need for strategies to diagnose rotating equipment health and make predictions about future machine states. This information can be used to schedule equipment maintenance in a manner that prevents unexpected failures from occurring while avoiding unnecessary maintenance procedures.

Related work:

Predictive maintenance (PdM) is a growing topic of research that aims to decrease the cost and redundancy associated with maintaining industrial equipment. A critical aspect of PdM is equipment health diagnosis, a process that involves using features extracted from machine signals to estimate the current health state of equipment. In industry, a common approach to equipment health diagnosis involves using subject matter expertise to establish process control regions for individual signal features. Deviations from a baseline healthy region can act as a trigger for maintenance actions. This approach is transparent and easily scalable for large fleets of industrial rotating equipment, but can be ineffective for complex equipment failures that can only be detected by monitoring multiple features simultaneously.

Recent literature has focused on using experimental data to build equipment health models for the purpose of diagnosis. A number of high-performance models have been developed for individual machines in highly controlled testing environments, but there has been little work to demonstrate the effectiveness of these methods for industrial rotating equipment [1, 2, 3]. Data-driven frameworks have also been proposed to facilitate the diagnosis of machines in industrial equipment fleets. Models in these frameworks are often structured as statistical thresholds [4], neural networks [5], or support vector machines [6], and are trained to distinguish between healthy equipment behavior and various modes of faulty equipment behavior. These classifiers can make diagnoses using features extracted from individual snapshots of machine signals, but they cannot analyze a time-series sequence of features, which is necessary to reliably identify the onset of equipment degradation in real time. Building these models also requires training data from experiments conducted with healthy and degraded equipment, which may be infeasible for fleets of process manufacturing equipment.

Approach:

This work presents a novel approach for modeling the life of industrial rotating equipment that is designed to facilitate equipment health diagnosis. The equipment health timeline depicted in Figure 1 represents a typical life cycle, in which a machine operates in a healthy state for some period of time until it encounters a degradation onset event, after which it operates in a degrading state that can culminate in failure if left unrepaired. Using this modeling approach, the process of equipment health diagnosis is equivalent to determining whether a machine is operating in a healthy or degrading state. The determination that a machine is in a degrading state can be used not only to predict a future failure event, thereby avoiding unscheduled downtime, but also to identify an ineffective repair operation, thereby reducing the occurrence of frequent consecutive downtime events.

It should be noted that [7] presents a similar model of equipment health for the purpose of optimizing maintenance inspections and backup equipment placement. That approach assumes that degrading behavior can always be diagnosed through inspection within a constant delay period preceding failure. This work does not assume that machines exhibit a constant delay period and is concerned with the methodology behind equipment health diagnosis. A trend-based diagnosis strategy that is well-suited to identify time-series behaviors associated with equipment degradation is presented here. Additionally, the models used to diagnose equipment health are built using unlabeled historical data collected between repair and failure events, eliminating the need for experimental training data. The strategy comprises a series of offline processes (feature selection, state labeling, and model building) along with an online diagnosis process.

To account for differences in rotating equipment and sensing capabilities across manufacturing facilities, the proposed strategy incorporates subject matter expertise into the feature selection process. This process involves specifying a selector function that maps the available signal features to a single, scalar health index that will be monitored for the purpose of health diagnosis. An appropriate function can be as simple as selecting a single feature that is known to show dynamic behavior during equipment degradation, or it may draw from historical data to combine the available features into a multivariate health index.
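The two kinds of selector function described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the feature names `vib_rms` and `temp_c`, the sample values, and the z-score combination rule are all assumptions made for the example.

```python
import math
from statistics import mean, stdev
from typing import Callable, Dict, List

Features = Dict[str, float]
Selector = Callable[[Features], float]


def single_feature_selector(name: str) -> Selector:
    """Health index = one feature chosen by a subject matter expert."""
    return lambda features: features[name]


def zscore_selector(history: List[Features], names: List[str]) -> Selector:
    """Health index = Euclidean norm of per-feature z-scores.

    The z-scores are computed against historical data. This combination
    rule is an assumption for illustration, not the method of the paper.
    """
    mu = {n: mean(h[n] for h in history) for n in names}
    sigma = {n: stdev(h[n] for h in history) for n in names}

    def select(features: Features) -> float:
        return math.sqrt(
            sum(((features[n] - mu[n]) / sigma[n]) ** 2 for n in names)
        )

    return select


# Hypothetical historical feature snapshots for one machine.
history = [
    {"vib_rms": 1.0, "temp_c": 60.0},
    {"vib_rms": 1.2, "temp_c": 62.0},
    {"vib_rms": 0.8, "temp_c": 58.0},
]
index_a = single_feature_selector("vib_rms")
index_b = zscore_selector(history, ["vib_rms", "temp_c"])

sample = {"vib_rms": 1.4, "temp_c": 64.0}
print(index_a(sample), index_b(sample))
```

Either selector returns a single scalar per snapshot, so the downstream labeling, model building, and diagnosis steps do not need to know which features a given facility has available.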

Historical data for a piece of rotating equipment is then condensed into a time series of health index measurements, punctuated by instantaneous repair and failure events. Data in this form has traditionally been difficult to analyze because health state labels are not readily available. The proposed state labeling process assumes that post-repair measurements can be considered “healthy” while pre-shutdown measurements can be labeled “degrading”. It iterates over all possible subsets of the measurements and selects the series that adheres most closely to a linear degradation trend. The beginning of this optimal subset is an estimate of when the transition from a “healthy” state to a “degrading” state has occurred, and the index measurements before and after this transition can be labeled accordingly.
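One way to picture the state labeling step is the sketch below. The abstract does not specify how adherence to a linear trend is scored, so the criterion here is an assumption: each candidate onset splits one repair-to-failure run into a prefix fit by a constant and a suffix fit by a least-squares line, and the split with the smallest total squared error is kept. The function names, the `min_len` parameter, and the data are hypothetical.

```python
from typing import List, Tuple


def sse_constant(y: List[float]) -> float:
    """Squared error of a constant (mean) fit, used for the healthy prefix."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y)


def sse_line(y: List[float]) -> float:
    """Squared error of a least-squares line fit over equally spaced samples."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return sum(
        (v - (y_mean + slope * (t - t_mean))) ** 2 for t, v in enumerate(y)
    )


def label_run(index: List[float], min_len: int = 3) -> Tuple[int, List[str]]:
    """Estimate the degradation onset in one repair-to-failure run.

    Assumed criterion: minimize constant-fit error before the onset plus
    line-fit error from the onset to the failure event.
    """
    n = len(index)
    if n < 2 * min_len:
        raise ValueError("run is too short to split into two segments")
    candidates = range(min_len, n - min_len + 1)
    onset = min(
        candidates,
        key=lambda k: sse_constant(index[:k]) + sse_line(index[k:]),
    )
    labels = ["healthy"] * onset + ["degrading"] * (n - onset)
    return onset, labels


# Hypothetical health index between a repair (start) and a failure (end).
run = [1.0, 0.9, 1.1, 1.0, 0.9, 1.1, 2.0, 2.5, 3.0, 3.5, 4.0]
onset, labels = label_run(run)
print(onset, labels)
```

The search is exhaustive over onset positions, which is inexpensive for a single run because each candidate only requires two closed-form fits.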

These labeled data sets are then used to build a novel, trend-based equipment health model. A hidden Markov model structure is used here to represent two health states (healthy and degrading) via a Markov chain. To capture the time-series behavior that characterizes healthy and degrading states, observations in this hidden Markov health model are the slopes of regression lines fit to a set of windowed health index measurements. The observation probability distribution in each state is assumed to be Gaussian and can be characterized by windowing the historical health index measurements for each state. Characterizing health index trends results in an equipment health model that is more robust to signal offsets, which are common in manufacturing environments, and presents an opportunity to detect faults and failures that could not be identified by purely value-based classifiers.
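The observation model described above can be sketched as follows: labeled health index measurements for each state are windowed, a regression slope is computed per window, and a Gaussian is characterized by the mean and standard deviation of those slopes. The window length of 3, the data, and the function names are assumptions for illustration; the transition probabilities of the Markov chain are not estimated here.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def window_slopes(index: List[float], window: int) -> List[float]:
    """Least-squares slope of each sliding window of equally spaced samples."""
    t_mean = (window - 1) / 2
    sxx = sum((t - t_mean) ** 2 for t in range(window))
    slopes = []
    for start in range(len(index) - window + 1):
        chunk = index[start:start + window]
        y_mean = sum(chunk) / window
        sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(chunk))
        slopes.append(sxy / sxx)
    return slopes


def fit_emissions(
    labeled: Dict[str, List[float]], window: int
) -> Dict[str, Tuple[float, float]]:
    """Gaussian (mean, standard deviation) of windowed slopes for each state."""
    model = {}
    for state, index in labeled.items():
        slopes = window_slopes(index, window)
        model[state] = (mean(slopes), stdev(slopes))
    return model


# Hypothetical labeled health index measurements for one machine.
labeled = {
    "healthy": [1.0, 0.9, 1.1, 1.0, 0.9, 1.1],
    "degrading": [2.0, 2.6, 2.9, 3.6, 4.0],
}
model = fit_emissions(labeled, window=3)
print(model)
```

Adding a constant offset to either series leaves every slope, and therefore the fitted model, unchanged, which illustrates the robustness to signal offsets noted above.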

The online diagnosis process uses the previously defined selector function to collect health index measurements from rotating equipment in real time. These indices are windowed to extract the regression slope values that serve as inputs to the Viterbi algorithm, a well-established method for using a hidden Markov model to infer the most likely state sequence given a series of unlabeled observations. The health diagnosis at any given time is simply the final state in this sequence.
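A minimal log-domain Viterbi decoder for a two-state model with Gaussian slope observations might look like the sketch below. All numerical parameters (initial, transition, and emission values) and the slope sequence are hypothetical placeholders, not values from the case study.

```python
import math
from typing import Dict, List, Tuple

STATES = ("healthy", "degrading")


def log_gaussian(x: float, mu: float, sigma: float) -> float:
    """Log-density of a Gaussian observation."""
    return (
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (x - mu) ** 2 / (2 * sigma ** 2)
    )


def viterbi(
    slopes: List[float],
    start: Dict[str, float],
    trans: Dict[str, Dict[str, float]],
    emit: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Most likely state sequence for a series of windowed slopes."""
    # Log-probability of the best path ending in each state.
    score = {
        s: math.log(start[s]) + log_gaussian(slopes[0], *emit[s])
        for s in STATES
    }
    back = []
    for x in slopes[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + math.log(trans[r][s]))
            pointers[s] = best
            score[s] = (
                prev[best] + math.log(trans[best][s]) + log_gaussian(x, *emit[s])
            )
        back.append(pointers)
    # Trace back from the most likely final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical model parameters (assumptions for illustration only).
start = {"healthy": 0.99, "degrading": 0.01}
trans = {
    "healthy": {"healthy": 0.95, "degrading": 0.05},
    "degrading": {"healthy": 0.01, "degrading": 0.99},
}
emit = {"healthy": (0.0, 0.075), "degrading": (0.5, 0.05)}

slopes = [0.0, 0.05, -0.05, 0.02, 0.45, 0.5, 0.55]
path = viterbi(slopes, start, trans, emit)
diagnosis = path[-1]  # the current health diagnosis is the final state
print(path, diagnosis)
```

In an online setting the decoder would be re-run, or updated incrementally, as each new slope arrives, and only the final state of the decoded sequence would be reported.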

Case study:

The proposed diagnosis strategy has been tested using data from a pump system operated by Dow Chemical Company. In the data set, three equipment failures were not identified until their later stages because the existing control region strategy did not provide a complete representation of equipment health. This trend-based diagnosis strategy was used to retroactively build a health model using historical vibration measurements, collected prior to the first failure, and then diagnose the equipment health state after each failure event, as shown in Figure 2. In all three cases, the estimated health state converges to the true health state specified by maintenance logs for this piece of equipment.

References:

[1] L. Guo, N. Li, F. Jia, Y. Lei, and J. Lin, “A recurrent neural network based health indicator for remaining useful life prediction of bearings,” Neurocomputing, vol. 240, pp. 98-109, May 2017.

[2] P. Li, X. Jia, J. Feng, H. Davari, G. Qiao, Y. Hwang, and J. Lee, “Prognosability study of ball screw degradation using systematic methodology,” Mechanical Systems and Signal Processing, vol. 109, pp. 45-57, Sept. 2018.

[3] W. Ahmad, S. A. Khan, and J. Kim, “A hybrid prognostics technique for rolling element bearings using adaptive predictive models,” IEEE Transactions on Industrial Electronics, vol. 65, no. 2, pp. 1577-1584, Feb. 2018.

[4] W. Yu, T. Dillon, F. Mostafa, W. Rahayu, and Y. Liu, “A global manufacturing big data ecosystem for fault detection in predictive maintenance,” IEEE Transactions on Industrial Informatics, vol. 16, no. 1, pp. 183-192, Jan. 2020.

[5] Y. Lei, F. Jia, J. Lin, S. Xing, and S. X. Ding, “An intelligent fault diagnosis method using unsupervised feature learning towards mechanical big data,” IEEE Transactions on Industrial Electronics, vol. 63, no. 5, pp. 3137-3147, May 2016.

[6] J. Wang, L. Zhang, L. Duan, and R. X. Gao, “A new paradigm of cloud-based predictive maintenance for intelligent manufacturing,” Journal of Intelligent Manufacturing, vol. 28, pp. 1125-1137, 2017.

[7] Y. Ye, I. Grossmann, J. Pinto, and S. Ramaswamy, “Modeling for reliability optimization of system design and maintenance based on Markov chain theory,” Computers & Chemical Engineering, vol. 124, pp. 381-404, May 2019.