Semiparametric Classification under a Forest Density Assumption

Date

2017-04-24

Abstract

This dissertation proposes a new semiparametric approach to binary classification that exploits the modeling flexibility of sparse graphical models. The approach is based on nonparametrically estimated densities, which are notoriously difficult to obtain when the number of dimensions is even moderately large. In this work, each class is assumed to be well represented by a family of undirected sparse graphical models, specifically a forest-structured distribution. Under this assumption, nonparametric estimation of only one- and two-dimensional marginal densities is required to transform the data into a space where a linear classifier is optimal. This work proves convergence results for the forest density classifier under certain conditions. Its performance is illustrated by comparing it with several state-of-the-art classifiers on simulated forest-distributed data as well as on a panel of real datasets from different domains. These experiments indicate that the proposed method is competitive with popular methods across a wide range of applications.
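
The following sketch illustrates why one- and two-dimensional marginals can suffice under a forest assumption. It uses the standard factorization of a forest-structured density. The notation (class label c in {0, 1}, edge set E_c, dimension d, class priors \pi_c) is assumed here for illustration and is not taken from the dissertation itself.

```latex
% Assumed notation: p_c is the class-c density, E_c the edge set of its forest.
% A forest-structured density factorizes over its edges and nodes:
\[
  p_c(x) \;=\; \prod_{(i,j)\in E_c} \frac{p_c(x_i, x_j)}{p_c(x_i)\,p_c(x_j)}
  \;\prod_{k=1}^{d} p_c(x_k), \qquad c \in \{0, 1\}.
\]
% Taking the log ratio of the two class densities gives a sum of terms,
% each depending on at most two coordinates of x:
\[
  \log\frac{p_1(x)}{p_0(x)}
  \;=\; \sum_{(i,j)\in E_1} \log\frac{p_1(x_i, x_j)}{p_1(x_i)\,p_1(x_j)}
  \;-\; \sum_{(i,j)\in E_0} \log\frac{p_0(x_i, x_j)}{p_0(x_i)\,p_0(x_j)}
  \;+\; \sum_{k=1}^{d} \log\frac{p_1(x_k)}{p_0(x_k)}.
\]
% With class priors \pi_0 and \pi_1, the Bayes rule assigns class 1 when
\[
  \log\frac{p_1(x)}{p_0(x)} \;>\; \log\frac{\pi_0}{\pi_1}.
\]
```

If each logarithmic term is treated as a transformed feature, the decision rule is linear in those features, which is consistent with the abstract's statement that a linear classifier is optimal in the transformed space.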

Keywords

classification, nonparametric density estimation, forests, machine learning