
dc.contributor.advisor: Johnson, Valen E.
dc.creator: Nikooienejad, Amir
dc.date.accessioned: 2019-01-16T20:43:11Z
dc.date.available: 2019-12-01T06:34:07Z
dc.date.created: 2017-12
dc.date.issued: 2017-12-01
dc.date.submitted: December 2017
dc.identifier.uri: https://hdl.handle.net/1969.1/173161
dc.description.abstract: The advent of new genomic technologies has resulted in the production of massive data sets. The outcomes in such experiments are often binary vectors or survival times, and the covariates are gene expressions obtained from thousands of genes under study. Analysis of these data, especially gene selection for a specific outcome, requires new statistical and computational methods. In this dissertation, I address this problem and propose one such method that is shown to be advantageous in selecting explanatory variables for prediction of binary responses and survival times. I adopt a Bayesian approach that utilizes a mixture of nonlocal prior densities and point masses on the regression coefficient vectors. The proposed method provides improved performance in identifying true models and reducing estimation and prediction error rates in a number of simulation studies for both binary and survival outcomes. I also describe a computational algorithm that can be used to implement the methodology in ultrahigh-dimensional settings (p ≫ n). In particular, for survival response data sets I show that MCMC is not feasible and instead provide a computational algorithm based on stochastic search that is scalable and p-invariant. As part of the variable selection methodology, I also propose a novel approach for setting prior hyperparameters by examining the total variation distance between the prior distributions on the regression parameters and the distribution of the maximum likelihood estimator under the null model. An R package, BVSNLP, which implements all of the methodology described here, is also introduced in this dissertation. It performs high-dimensional Bayesian variable selection for binary and survival outcome data sets and is expected to have a variety of applications, including cancer genomic studies.
Another problem addressed in this dissertation is the methodology for deriving Uniformly Most Powerful Bayesian tests (UMPBTs) and extending them from exponential family distributions to a larger class of testing contexts. UMPBTs are an objective class of Bayesian hypothesis tests that can be considered the Bayesian counterpart of classical uniformly most powerful tests. However, they have previously been developed only for one-parameter exponential family models. I introduce sufficient conditions for the existence of UMPBTs and propose a unified approach for their derivation. An important application of my methodology is the extension of UMPBTs to testing whether the noncentrality parameter of a χ² distribution is zero. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Bayesian Variable Selection [en]
dc.subject: Nonlocal Priors [en]
dc.subject: High Dimensional Analysis [en]
dc.subject: Cancer Genomics [en]
dc.subject: Binary Response Data [en]
dc.subject: Survival Data [en]
dc.subject: Uniformly Most Powerful Bayesian Tests (UMPBT) [en]
dc.subject: R package [en]
dc.title: Bayesian Variable Selection in High Dimensional Genomic Studies Using Nonlocal Priors [en]
dc.type: Thesis [en]
thesis.degree.department: Statistics [en]
thesis.degree.discipline: Statistics [en]
thesis.degree.grantor: Texas A & M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Wang, Wenyi
dc.contributor.committeeMember: Bhattacharya, Anirban
dc.contributor.committeeMember: Sivakumar, Natarajan
dc.type.material: text [en]
dc.date.updated: 2019-01-16T20:43:11Z
local.embargo.terms: 2019-12-01
local.etdauthor.orcid: 0000-0003-4696-8300

