Feature Selection for Unsupervised and Supervised Learning

Sui, Xiaopeng

Date

2018-11-01

Abstract

Unsupervised and semi-supervised learning are explored through convex clustering with metric learning, while supervised learning is explored through a novel feature selection method. First, we evaluate the performance of convex clustering against previous clustering formulations. We then implement two metric learning schemes in convex clustering to replace the Euclidean distance used in the original convex clustering formulation. The first scheme uses a full-rank positive definite matrix to characterize a Mahalanobis metric, and the second uses a sparse compositional metric. This sparse compositional metric is a weighted sum of a set of orthonormal rank-1 basis vectors. In experiments on both simulated and real-life data, convex clustering with metric learning, especially with a sparse compositional metric, can outperform standard convex clustering, other methods based on convex clustering, and popular earlier clustering algorithms. Second, a novel feature selection method is proposed that uses Chow-Liu tree approximations to estimate Shannon’s mutual information. In experimental analysis, this Chow-Liu tree feature selection method outperforms previous feature selection methods when classification accuracy is used as the performance measure.

Citation

Sui, Xiaopeng (2018). Feature Selection for Unsupervised and Supervised Learning. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/174440.