2006 AIChE Annual Meeting
(679b) Model Predictive Discrimination Approach for Classification of Process and Biological Data
The basic idea adopted in this study is to build, for each class, a variable predictive model that represents the associations among the attributes. Each variable is modeled as a function of all the remaining variables or of an optimally selected subset of them, and the model parameters are estimated from the training data. The variable predictive model structure designed for a particular class distinctly characterizes the intra-class attribute relations of that class. It is hypothesized that a sample observation of all the variables is predicted best by the model of the class to which it belongs. The hypothesis is tested using a measure of predictive capability based on statistical error. The sample to be tested is projected onto each class model to re-predict the full set of variables, and the prediction accuracy is used as the discriminating criterion.
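The procedure described above can be illustrated with a minimal sketch. The abstract does not specify the form of the variable predictive models or the exact error measure, so the code below makes two assumptions: each variable is predicted from all the others by ordinary linear least squares, and the discriminating criterion is the sum of squared re-prediction errors. The function names `fit_class_models`, `prediction_error` and `classify` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_class_models(X, y):
    """Fit, for each class, one linear model per variable.

    Assumption: each variable is regressed on all remaining variables
    (plus an intercept) by ordinary least squares.
    """
    models = {}
    for label in np.unique(y).tolist():
        Xc = X[y == label]                      # training samples of this class
        coefs = []
        for j in range(X.shape[1]):
            others = np.delete(Xc, j, axis=1)   # all variables except j
            A = np.column_stack([np.ones(len(others)), others])
            beta, *_ = np.linalg.lstsq(A, Xc[:, j], rcond=None)
            coefs.append(beta)
        models[label] = coefs
    return models


def prediction_error(class_model, x):
    """Sum of squared errors when re-predicting every variable of x."""
    err = 0.0
    for j, beta in enumerate(class_model):
        others = np.delete(x, j)
        pred = beta[0] + others @ beta[1:]
        err += (x[j] - pred) ** 2
    return err


def classify(models, x):
    """Assign x to the class whose model re-predicts it with the least error."""
    return min(models, key=lambda label: prediction_error(models[label], x))
```

Calling `fit_class_models(X_train, y_train)` once and then `classify(models, x)` for each test sample reproduces the project-and-compare step. A nonlinear model form or a selected subset of predictors could be substituted for the linear fit without changing the overall structure.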
The new Model Predictive Discrimination (MPD) algorithm for classification problems is validated using illustrative and well-studied classification examples from process and biological applications. Experimental data sets and established data sets from the literature are used as benchmarks. The performance of the new method is compared with that of classical classification algorithms such as LDA, QDA, SVM and ANN. The MPD method is observed to perform well in all cases. Its results are superior to those of the existing methods, especially for data sets with nonlinear variable interactions. The MPD classification method can be extended to many fault detection and diagnosis applications in process and biological systems.
Keywords: Discriminant analysis, data classification, fault detection and diagnosis, computational biology, pattern recognition.