dc.contributor.advisorJohnson, Valen E.
dc.contributor.advisorDahl, David B.
dc.creatorJoshi, Adarsh
dc.date.accessioned2012-07-16T15:56:11Z
dc.date.accessioned2012-07-16T20:18:02Z
dc.date.available2012-07-16T15:56:11Z
dc.date.available2012-07-16T20:18:02Z
dc.date.created2010-05
dc.date.issued2012-07-16
dc.date.submittedMay 2010
dc.identifier.urihttps://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7740
dc.description.abstractBayesian methods are often criticized on the grounds of subjectivity. Furthermore, misspecified priors can have a deleterious effect on Bayesian inference. Noting that model selection is effectively a test of many hypotheses, Dr. Valen E. Johnson sought to eliminate the need for prior specification by computing Bayes' factors from frequentist test statistics. In his pioneering work, published in 2005, Dr. Johnson proposed using so-called local priors for computing Bayes' factors from test statistics. Dr. Johnson and Dr. Jianhua Hu used Bayes' factors for model selection in a linear model setting. In independent work, Dr. Johnson and another colleague, David Rossell, investigated two families of non-local priors for testing the regression parameter in a linear model setting. These non-local priors enable greater separation between the theories of the null and alternative hypotheses. In this dissertation, I extend model selection based on Bayes' factors and use non-local priors to define Bayes' factors based on test statistics. With these priors, I have been able to reduce the problem of prior specification to setting just one scaling parameter. That scaling parameter can be easily set, for example, on the basis of the frequentist operating characteristics of the corresponding Bayes' factors. Furthermore, the loss of information from basing a Bayes' factor on a test statistic is minimal. Along with Dr. Johnson and Dr. Hu, I used Bayes' factors based on the likelihood ratio statistic to develop a method for clustering gene expression data. This method has performed well on both simulated examples and real datasets. An outline of that work is also included in this dissertation. Further, I extend the clustering model to a subclass of the decomposable graphical model class, which is more appropriate for genotype data sets, such as single-nucleotide polymorphism (SNP) data.
Efficient FORTRAN programming has enabled me to apply the methodology to networks with hundreds of nodes. For problems that produce computationally harder probability landscapes, I propose a modification of the Markov chain Monte Carlo algorithm to extract information regarding the important network structures in the data. This modified algorithm performs well in inferring complex network structures. I use this method to develop a prediction model for disease based on SNP data. My method performs well in cross-validation studies.en
dc.format.mimetypeapplication/pdf
dc.language.isoen_US
dc.subjectBayes factorsen
dc.subjectBayes factors based on test statisticsen
dc.subjectBayesian Graphsen
dc.subjectMCMCen
dc.subjectObjective Bayesian Analysisen
dc.subjectBayesian Model Selectionen
dc.subjectMicroarray dataen
dc.subjectSingle-nucleotide polymorphism (SNP) dataen
dc.titleBayesian Model Selection for High-dimensional High-throughput Dataen
dc.typeThesisen
thesis.degree.departmentStatisticsen
thesis.degree.disciplineStatisticsen
thesis.degree.grantorTexas A&M Universityen
thesis.degree.nameDoctor of Philosophyen
thesis.degree.levelDoctoralen
dc.contributor.committeeMemberHu, Jianhua
dc.contributor.committeeMemberBroom, Bradley M.
dc.contributor.committeeMemberIvanov, Ivan V.
dc.type.genrethesisen
dc.type.materialtexten