A Comprehensive Approach for Sparse Principle Component Analysis using Regularized Singular Value Decomposition
Abstract
Principal component analysis (PCA) has been a widely used tool in statistics and data analysis for many years. A good PCA result should be both interpretable and accurate. However, neither interpretability nor accuracy is easily achieved in “big data” scenarios with large numbers of original variables. Sparse PCA was therefore developed: each principal component (PC) it produces is a linear combination of a limited number of original variables, which yields good interpretability. In addition, theoretical results have shown that, when the true model is sparse, PCs obtained via sparse PCA rather than via traditional PCA are consistent estimators. These properties have made sparse PCA a hot research topic in recent years.
In this dissertation, we developed a comprehensive and systematic approach to sparse PCA based on the singular value decomposition (SVD). Specifically, we proposed a formulation and an algorithm and showed consistency and convergence. We further showed convergence to global optima using a limited number of trials, which is a breakthrough in the sparse PCA area. To guarantee orthogonality or uncorrelatedness when multiple PCs are extracted, we developed a method for sparse PCA with an orthogonal constraint, proposed an algorithm for it, and showed its convergence. To deal with missing values in the design matrix, which often occur in practice, we developed a method for sparse PCA with missing values, proposed an algorithm for it, and showed its convergence. Moreover, to provide a good way of selecting the tuning parameter in these formulations, we designed an entry-wise cross-validation method based on sparse PCA with missing values. Together, these contributions make our results practically useful and theoretically complete. A simulation study and real-world data analysis are also provided; they show that our method gives results competitive with other methods in cases without missing values, and good results in cases with missing values, where ours is currently the only practical method.
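As a rough illustration of the kind of SVD-based sparse PCA described above, the following sketch alternates between two closed-form updates for a rank-one fit with an L1 penalty on the loading vector. It is a generic textbook-style scheme and not necessarily the formulation or algorithm of the dissertation; the objective ||X - u v^T||_F^2 + 2*lam*||v||_1 with ||u|| = 1, the function names, and the parameter `lam` are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_rank_one_svd(X, lam, n_iter=200, tol=1e-8):
    """Hypothetical helper: alternating updates for an L1-penalized rank-one SVD.

    Assumed objective: ||X - u v^T||_F^2 + 2 * lam * ||v||_1 with ||u||_2 = 1.
    Returns the unit-norm left vector u and the sparse, unnormalized loading v.
    """
    # Start from the ordinary leading singular pair.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u = U[:, 0]
    v = s[0] * Vt[0]
    for _ in range(n_iter):
        # With u fixed, the minimizer over v is a soft-thresholded projection.
        v_new = soft_threshold(X.T @ u, lam)
        if not np.any(v_new):
            # The penalty is large enough to zero out every loading.
            return u, v_new
        # With v fixed, the minimizer over u is the normalized vector X v.
        u_new = X @ v_new
        u_new = u_new / np.linalg.norm(u_new)
        converged = np.linalg.norm(v_new - v) < tol
        u, v = u_new, v_new
        if converged:
            break
    return u, v
```

Larger values of `lam` drive more loadings exactly to zero; in practice this tuning parameter would be chosen by some form of cross-validation, as the abstract discusses.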
Subject
Principal Component Analysis
Sparse PCA
Singular Value Decomposition
Regularized SVD
Alternating Direction
Block Coordinate Descent
Regularity
Power Iteration
Global Optima
Orthogonal Constraint
Missing Values
Cross-Validation
Citation
Liu, Senmao (2016). A Comprehensive Approach for Sparse Principle Component Analysis using Regularized Singular Value Decomposition. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/192022.