Feature Selection for Unsupervised and Supervised Learning
Abstract
Unsupervised and semi-supervised learning are explored in convex clustering with metric
learning while supervised learning is explored in a novel feature selection method. First, we evaluate
the performance of convex clustering against previous clustering formulations. Moreover,
we implement two metric learning schemes in convex clustering to replace the Euclidean distance
used in the original convex clustering formulation. The first metric learning scheme involves using
a full-rank positive definite matrix to characterize a Mahalanobis metric and the second metric
learning scheme involves using a sparse compositional metric. This sparse compositional metric is
a weighted sum of a set of orthonormal rank-1 basis vectors. In experiments on both simulated
and real-life data, convex clustering with metric learning, especially with a sparse compositional
metric, can outperform convex clustering, other methods based on convex clustering and previous
popular clustering algorithms. Second, a novel feature selection method is proposed using
Chow-Liu tree approximations to estimate Shannon’s mutual information. In experimental analysis,
this Chow-Liu tree feature selection method outperforms previous feature selection methods
when classification accuracy is used as a performance measure.
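
As an illustration of the two metric forms named above, the following is a minimal sketch and not the dissertation's implementation. It assumes the Mahalanobis distance takes the usual form sqrt((x - y)^T M (x - y)) and that the sparse compositional metric can be written as M = sum_k w_k b_k b_k^T with non-negative, mostly zero weights w_k over orthonormal vectors b_k. The function names, the random basis, and the weight values are hypothetical and chosen only for demonstration; in the dissertation the metric is learned rather than fixed by hand.

```python
import numpy as np

def mahalanobis_distance(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a symmetric positive semidefinite M."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Clamp at zero to guard against tiny negative values from round-off.
    return float(np.sqrt(max(d @ M @ d, 0.0)))

def compositional_metric(basis, weights):
    """Build M = sum_k w_k * b_k b_k^T from the columns b_k of `basis` (hypothetical helper)."""
    basis = np.asarray(basis, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    # Scaling column k by w_k and multiplying by basis^T sums the weighted rank-1 terms.
    return (basis * weights) @ basis.T

rng = np.random.default_rng(0)
# Orthonormal basis vectors as the columns of Q from a QR factorization (illustrative choice).
basis, _ = np.linalg.qr(rng.standard_normal((4, 4)))
# Sparse weights: only two of the four rank-1 terms are active (illustrative values).
weights = np.array([2.0, 0.0, 0.5, 0.0])
M_sparse = compositional_metric(basis, weights)

x, y = rng.standard_normal(4), rng.standard_normal(4)
d_euclid = mahalanobis_distance(x, y, np.eye(4))   # identity metric reduces to Euclidean distance
d_sparse = mahalanobis_distance(x, y, M_sparse)    # distance under the sparse compositional metric
```

With the identity matrix the distance reduces to the Euclidean distance used in the original convex clustering formulation, while the sparse weights restrict the metric to a few directions of the basis.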
Subject
Convex Clustering
Metric Learning
Sparsity
Feature Selection
Chow-Liu Tree
Mutual Information
Citation
Sui, Xiaopeng (2018). Feature Selection for Unsupervised and Supervised Learning. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/174440.