Semiparametric Classification under a Forest Density Assumption
Date
2017-04-24
Abstract
This dissertation proposes a new semiparametric approach for binary classification that exploits the modeling flexibility of sparse graphical models. The approach is based on non-parametrically estimated densities, which are notoriously difficult to obtain when the number of dimensions is even moderately large. In this work, it is assumed that each class can be well represented by a family of undirected sparse graphical models, specifically a forest-structured distribution. Under this assumption, only one- and two-dimensional marginal densities need to be estimated non-parametrically to transform the data into a space where a linear classifier is optimal.
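The reason one- and two-dimensional marginals suffice can be sketched with the standard factorization of a forest-structured density. The notation below (forest edge sets E_0 and E_1, class-conditional densities p_0 and p_1, class priors \pi_0 and \pi_1, dimension d) is assumed here for illustration and is not taken from the dissertation, whose exact transformation may differ.

```latex
% Density that factorizes over a forest F = (V, E) on d variables:
% only univariate and bivariate marginals appear.
p_F(x) = \prod_{(i,j) \in E} \frac{p(x_i, x_j)}{p(x_i)\, p(x_j)}
         \prod_{k=1}^{d} p(x_k)

% Log posterior odds for classes 1 and 0, each with its own forest E_c:
\log \frac{\pi_1\, p_{F_1}(x)}{\pi_0\, p_{F_0}(x)}
  = \log \frac{\pi_1}{\pi_0}
  + \sum_{k=1}^{d} \bigl[ \log p_1(x_k) - \log p_0(x_k) \bigr]
  + \sum_{(i,j) \in E_1} \log \frac{p_1(x_i, x_j)}{p_1(x_i)\, p_1(x_j)}
  - \sum_{(i,j) \in E_0} \log \frac{p_0(x_i, x_j)}{p_0(x_i)\, p_0(x_j)}
```

Under these assumptions, the log posterior odds are a sum of log-marginal terms, so a decision rule that thresholds this quantity is linear in features built from the estimated one- and two-dimensional densities.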
This work proves convergence results for the forest density classifier under certain conditions. The classifier's performance is illustrated by comparing it with several state-of-the-art classifiers on simulated forest-distributed data and on a panel of real datasets from different domains. These experiments indicate that the proposed method is competitive with popular methods across a wide range of applications.
Keywords
classification, nonparametric density estimation, forests, machine learning