Bayesian Model Selection for High-dimensional High-throughput Data

Joshi, Adarsh

dc.contributor.advisor	Johnson, Valen E.
dc.contributor.advisor	Dahl, David B.
dc.creator	Joshi, Adarsh
dc.date.accessioned	2012-07-16T15:56:11Z
dc.date.accessioned	2012-07-16T20:18:02Z
dc.date.available	2012-07-16T15:56:11Z
dc.date.available	2012-07-16T20:18:02Z
dc.date.created	2010-05
dc.date.issued	2012-07-16
dc.date.submitted	May 2010
dc.identifier.uri	https://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7740
dc.description.abstract	Bayesian methods are often criticized on the grounds of subjectivity. Furthermore, misspecified priors can have a deleterious effect on Bayesian inference. Noting that model selection is effectively a test of many hypotheses, Dr. Valen E. Johnson sought to eliminate the need for prior specification by computing Bayes' factors from frequentist test statistics. In his pioneering work published in 2005, Dr. Johnson proposed using so-called local priors for computing Bayes' factors from test statistics. Dr. Johnson and Dr. Jianhua Hu used Bayes' factors for model selection in a linear model setting. In independent work, Dr. Johnson and another colleague, David Rossell, investigated two families of non-local priors for testing the regression parameter in a linear model setting. These non-local priors enable greater separation between the null and alternative hypotheses. In this dissertation, I extend model selection based on Bayes' factors and use non-local priors to define Bayes' factors based on test statistics. With these priors, I have been able to reduce the problem of prior specification to setting just one scaling parameter. That scaling parameter can be easily set, for example, on the basis of frequentist operating characteristics of the corresponding Bayes' factors. Furthermore, the loss of information from basing a Bayes' factor on a test statistic is minimal. Along with Dr. Johnson and Dr. Hu, I used the Bayes' factors based on the likelihood ratio statistic to develop a method for clustering gene expression data. This method has performed well in both simulated examples and real datasets. An outline of that work is also included in this dissertation. Further, I extend the clustering model to a subclass of the decomposable graphical model class, which is more appropriate for genotype data sets, such as single-nucleotide polymorphism (SNP) data.
Efficient FORTRAN programming has enabled me to apply the methodology to hundreds of nodes. For problems that produce computationally harder probability landscapes, I propose a modification of the Markov chain Monte Carlo algorithm to extract information regarding the important network structures in the data. This modified algorithm performs well in inferring complex network structures. I use this method to develop a prediction model for disease based on SNP data. My method performs well in cross-validation studies.	en
dc.format.mimetype	application/pdf
dc.language.iso	en_US
dc.subject	Bayes factors	en
dc.subject	Bayes factors based on test statistics	en
dc.subject	Bayesian Graphs	en
dc.subject	MCMC	en
dc.subject	Objective Bayesian Analysis	en
dc.subject	Bayesian Model Selection	en
dc.subject	Microarray data	en
dc.subject	Single-nucleotide polymorphism (SNP) data	en
dc.title	Bayesian Model Selection for High-dimensional High-throughput Data	en
dc.type	Thesis	en
thesis.degree.department	Statistics	en
thesis.degree.discipline	Statistics	en
thesis.degree.grantor	Texas A&M University	en
thesis.degree.name	Doctor of Philosophy	en
thesis.degree.level	Doctoral	en
dc.contributor.committeeMember	Hu, Jianhua
dc.contributor.committeeMember	Broom, Bradley M.
dc.contributor.committeeMember	Ivanov, Ivan V.
dc.type.genre	thesis	en
dc.type.material	text	en

Files in this item

Name: JOSHI-DISSERTATION.pdf
Size: 1.180Mb
Format: PDF

This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
