Interval Principal Component Analysis and Its Application to Fault Detection and Data Classification
Abstract
Principal Component Analysis (PCA) is a linear data analysis tool that aims to reduce the dimensionality of a dataset, while retaining most of the variation found in it. It transforms the variables of a dataset into a new set of variables, called principal components, using linear combinations of the original variables. PCA is a powerful statistical technique used in research for fault detection, classification and feature extraction. Interval Principal Component Analysis (IPCA) is an extension to PCA designed to apply PCA to large datasets using interval data generated from single-valued samples. In this thesis, three IPCA methods are introduced: Centers IPCA (CIPCA), Midpoint-Radii IPCA (MRIPCA), and Symbolic Covariance IPCA (SCIPCA). In addition, the methods and parameters used for fault detection and classification applications are described for classical and interval data.
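As a rough illustration of the ideas above, the following sketch builds interval data from single-valued samples and runs classical PCA on the interval centers, which is the basic idea behind a centers-based interval PCA. The abstract does not specify how intervals are generated, so the min/max aggregation over non-overlapping windows, the function names `to_intervals` and `pca`, the window length, and the random test data are all assumptions made for illustration; the thesis's own CIPCA, MRIPCA, and SCIPCA formulations may differ.

```python
import numpy as np

def to_intervals(X, window):
    """Aggregate single-valued samples into intervals.

    Assumed scheme (not taken from the thesis): each interval spans the
    min and max of `window` consecutive rows, per variable.
    """
    n = (X.shape[0] // window) * window          # drop an incomplete tail window
    blocks = X[:n].reshape(-1, window, X.shape[1])
    lower, upper = blocks.min(axis=1), blocks.max(axis=1)
    centers = (lower + upper) / 2.0              # interval midpoints
    radii = (upper - lower) / 2.0                # interval half-widths
    return centers, radii

def pca(X, n_components):
    """Classical PCA via eigendecomposition of the sample covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]

# Hypothetical data: 1000 single-valued samples of 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))

centers, radii = to_intervals(X, window=10)      # 100 intervals per variable
mean, loadings, variances = pca(centers, n_components=2)

# Principal components are linear combinations of the original variables.
scores = (centers - mean) @ loadings
```

Here 1000 samples are reduced to 100 intervals before the decomposition, which is one way interval data can make PCA cheaper on large datasets.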
The performance of the interval-generation methods used in IPCA is analyzed under different conditions. Moreover, three synthetic datasets were used to test the fault detection performance of all methods, and three real datasets were used to test their classification performance. The results show that, at the same false alarm rate, IPCA methods achieve a higher detection rate on average than classical PCA. Moreover, unlike PCA, IPCA methods are capable of accurately differentiating the type of fault: interval centers detect changes in mean, while interval radii detect changes in variance. For data classification, the results show that MRIPCA has the highest precision on average among all methods.
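The split between centers and radii can be checked on a toy signal. The sketch below is only an illustration under assumed settings (min/max intervals over non-overlapping windows, a hypothetical helper `centers_and_radii`, and made-up fault sizes); it is not the detection procedure used in the thesis, whose statistics and thresholds the abstract does not describe.

```python
import numpy as np

def centers_and_radii(x, window):
    """Interval centers and radii from min/max over non-overlapping windows
    (an assumed interval-generation scheme)."""
    n = (len(x) // window) * window
    blocks = x[:n].reshape(-1, window)
    lower, upper = blocks.min(axis=1), blocks.max(axis=1)
    return (lower + upper) / 2.0, (upper - lower) / 2.0

rng = np.random.default_rng(1)
x = rng.normal(size=200)                          # fault-free signal

shift, scale = 3.0, 2.0                           # hypothetical fault sizes
x_mean_fault = x + shift                          # change in mean only
x_var_fault = x.mean() + scale * (x - x.mean())   # change in spread only

c0, r0 = centers_and_radii(x, 20)
c1, r1 = centers_and_radii(x_mean_fault, 20)
c2, r2 = centers_and_radii(x_var_fault, 20)

# Mean fault: every center moves by `shift`, the radii are unchanged.
mean_fault_moves_centers = np.allclose(c1 - c0, shift) and np.allclose(r1, r0)
# Variance fault: every radius is multiplied by `scale`.
var_fault_scales_radii = np.allclose(r2, scale * r0)
```

Under these assumptions, a pure mean shift is visible only in the centers, while a change in spread scales the radii, which is consistent with the behavior reported above.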
Subject
Principal Component Analysis
Interval Data
Midpoint-Radii
Symbolic Covariance
Fault Detection
Data Classification
Citation
Basha, Nour Mansour Abdelhafez Mohamed (2018). Interval Principal Component Analysis and Its Application to Fault Detection and Data Classification. Master's thesis, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/173310.