
dc.contributor.advisor: Huang, Jianhua
dc.contributor.advisor: Hu, Jianhua
dc.creator: Feng, Shuo
dc.date.accessioned: 2015-02-05T17:24:04Z
dc.date.available: 2016-08-01T05:30:25Z
dc.date.created: 2014-08
dc.date.issued: 2014-06-24
dc.date.submitted: August 2014
dc.identifier.uri: https://hdl.handle.net/1969.1/153318
dc.description.abstract: We develop a new way of thinking about and integrating gene expression data (continuous) and genomic information data (binary) by jointly compressing the two data sets and embedding their signals in low-dimensional feature spaces with an information-sharing mechanism, which connects the continuous data to the binary data, under the penalized log-likelihood framework. In particular, the continuous data are modeled by a Gaussian likelihood and the binary data are modeled by a Bernoulli likelihood, which is formed by transforming the feature space of the genomic information with a logit link. The smoothly clipped absolute deviation (SCAD) penalty is added on the basis vectors of the low-dimensional feature spaces for both data sets. This choice rests on the assumption that only a small set of genetic variants are associated with a small fraction of gene expression, and on the fact that those basis vectors can be interpreted as weights assigned to the genetic variants and gene expression, similar to the way the loading vectors of principal component analysis (PCA) or canonical correlation analysis (CCA) are interpreted. Algorithmically, a Majorization-Minimization (MM) algorithm with local linear approximation (LLA) to the SCAD penalty is developed to solve the resulting optimization problem effectively and efficiently, and it produces closed-form updating rules. The effectiveness of our method is demonstrated by simulations in various setups, with comparisons to some popular competing methods, and by an application to eQTL mapping with real data. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Data integration [en]
dc.subject: eQTL [en]
dc.subject: GWAS [en]
dc.subject: CCA [en]
dc.title: A Likelihood Based Framework for Data Integration with Application to eQTL Mapping [en]
dc.type: Thesis [en]
thesis.degree.department: Statistics [en]
thesis.degree.discipline: Statistics [en]
thesis.degree.grantor: Texas A & M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Wu, Guoyao
dc.contributor.committeeMember: Sherman, Michael
dc.type.material: text [en]
dc.date.updated: 2015-02-05T17:24:04Z
local.embargo.terms: 2016-08-01
local.etdauthor.orcid: 0000-0003-0688-9122
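
The abstract names the SCAD penalty and its local linear approximation (LLA) without stating them. As a rough illustration only, the sketch below writes out the standard SCAD penalty and its derivative in their commonly published form, and shows how LLA turns the penalty into per-coefficient L1 weights evaluated at the previous iterate. The function names, the default a = 3.7, and the example values are assumptions made here for illustration; none of this is taken from the thesis, and it does not reproduce the thesis's MM updating rules.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient (requires a > 2, lam > 0)."""
    t = abs(theta)
    if t <= lam:
        # Behaves like the lasso penalty near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region between lam and a * lam.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|."""
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1)


def lla_weights(theta_old, lam, a=3.7):
    """LLA step: weights w_j such that the SCAD penalty is replaced by
    sum_j w_j * |theta_j|, with w_j evaluated at the previous iterate."""
    return [scad_derivative(t, lam, a) for t in theta_old]


# Hypothetical previous iterate: a zero, a moderate, and a large coefficient.
weights = lla_weights([0.0, 2.0, 5.0], lam=1.0)
```

With these weights, a coefficient that was near zero at the previous iterate receives the full L1 weight `lam`, while a coefficient larger than `a * lam` receives weight zero and is left unpenalized in the next weighted-L1 subproblem.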

