Bayesian Variable Selection in High Dimensional Genomic Studies Using Nonlocal Priors

Nikooienejad, Amir

dc.contributor.advisor	Johnson, Valen E.
dc.creator	Nikooienejad, Amir
dc.date.accessioned	2019-01-16T20:43:11Z
dc.date.available	2019-12-01T06:34:07Z
dc.date.created	2017-12
dc.date.issued	2017-12-01
dc.date.submitted	December 2017
dc.identifier.uri	https://hdl.handle.net/1969.1/173161
dc.description.abstract	The advent of new genomic technologies has resulted in the production of massive data sets. The outcomes in such experiments are often binary vectors or survival times, and the covariates are gene expressions obtained from thousands of genes under study. Analysis of these data, especially gene selection for a specific outcome, requires new statistical and computational methods. In this dissertation, I address this problem and propose one such method that is shown to be advantageous in selecting explanatory variables for prediction of binary responses and survival times. I adopt a Bayesian approach that utilizes a mixture of nonlocal prior densities and point masses on the regression coefficient vectors. The proposed method provides improved performance in identifying true models and reducing estimation and prediction error rates in a number of simulation studies for both binary and survival outcomes. I also describe a computational algorithm that can be used to implement the methodology in ultrahigh-dimensional settings (p ≫ n). In particular, for survival response datasets I show that MCMC is not feasible and instead provide a computational algorithm, based on stochastic search, that is scalable and p-invariant. As part of the variable selection methodology, I also propose a novel approach for setting prior hyperparameters by examining the total variation distance between the prior distributions on the regression parameters and the distribution of the maximum likelihood estimator under the null distribution. An R package, BVSNLP, is also introduced in this dissertation as a final product that implements all of the methodology described here. It performs high-dimensional Bayesian variable selection for binary and survival outcome datasets and is expected to have a variety of applications, including cancer genomic studies.
Another problem addressed in this dissertation is the methodology for deriving Uniformly Most Powerful Bayesian tests (UMPBTs) and extending them from exponential family distributions to a larger class of testing contexts. UMPBTs are an objective class of Bayesian hypothesis tests that can be considered the Bayesian counterpart of classical uniformly most powerful tests. However, they have previously been developed only for application in one-parameter exponential family models. I introduce sufficient conditions for the existence of UMPBTs and propose a unified approach for their derivation. An important application of my methodology is the extension of UMPBTs to testing whether the noncentrality parameter of a χ² distribution is zero.	en
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	Bayesian Variable Selection	en
dc.subject	Nonlocal Priors	en
dc.subject	High Dimensional Analysis	en
dc.subject	Cancer Genomics	en
dc.subject	Binary Response Data	en
dc.subject	Survival Data	en
dc.subject	Uniformly Most Powerful Bayesian Tests (UMPBT)	en
dc.subject	R package	en
dc.title	Bayesian Variable Selection in High Dimensional Genomic Studies Using Nonlocal Priors	en
dc.type	Thesis	en
thesis.degree.department	Statistics	en
thesis.degree.discipline	Statistics	en
thesis.degree.grantor	Texas A & M University	en
thesis.degree.name	Doctor of Philosophy	en
thesis.degree.level	Doctoral	en
dc.contributor.committeeMember	Wang, Wenyi
dc.contributor.committeeMember	Bhattacharya, Anirban
dc.contributor.committeeMember	Sivakumar, Natarajan
dc.type.material	text	en
dc.date.updated	2019-01-16T20:43:11Z
local.embargo.terms	2019-12-01
local.etdauthor.orcid	0000-0003-4696-8300

Files in this item

Name: NIKOOIENEJAD-DISSERTATION-2017.pdf
Size: 525.0Kb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
