Show simple item record

dc.contributor.advisor: Carroll, Raymond J
dc.creator: Liang, Liang
dc.date.accessioned: 2017-08-21T14:38:26Z
dc.date.available: 2019-05-01T06:11:25Z
dc.date.created: 2017-05
dc.date.issued: 2017-05-01
dc.date.submitted: May 2017
dc.identifier.uri: https://hdl.handle.net/1969.1/161456
dc.description.abstract: As a cost-efficient alternative to the cohort design, the case-control design is widely used in epidemiological studies. The primary analysis of case-control studies focuses on the relationship between disease status and potential risk factors, while the secondary analysis examines the interrelationships among risk factors. This dissertation considers three semiparametric models arising in the primary and secondary analysis of case-control studies and develops novel semiparametric estimators with high estimation efficiency. We first investigate a special primary analysis problem, the gene-environment interaction model under the gene-environment independence assumption. While all existing approaches that exploit this independence assumption rely on a rare disease assumption and/or a distributional assumption on the genetic variable, we allow the disease rate and the distributions of the genetic and environmental variables in the underlying source population to be unknown. Under this flexible semiparametric model, we derive the semiparametric efficient estimator and show, through various numerical illustrations, that it outperforms prospective logistic regression, the standard approach in primary analysis. In the secondary conditional mean regression model, we analyze the interrelationship between covariates when only a conditional mean model is specified. Because the error distribution is unknown and the data are collected under a case-control design, semiparametric efficient estimation requires multivariate nonparametric regression on various quantities, which suffers from the curse of dimensionality as the dimension of the covariates increases. We bypass this problem by devising a dimension reduction approach. The resulting estimator is robust against misspecification of the regression error distribution and shows substantial efficiency gains over several existing methods.
Lastly, we consider a secondary conditional quantile regression problem. Quantile regression is a preferable model in epidemiology when high or low values in the population are associated with high risks. Under a semiparametric framework that allows the covariate distribution to be nonparametric, we derive a class of consistent semiparametric estimators and identify the efficient member. The resulting estimator dominates the weighted estimating equation approach, the only published approach to secondary quantile regression, both theoretically and numerically. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Biased samples [en]
dc.subject: Case-control study [en]
dc.subject: Gene-environment interaction [en]
dc.subject: Primary Analysis [en]
dc.subject: Secondary analysis [en]
dc.subject: Semiparametric estimation [en]
dc.subject: Heteroscedastic errors [en]
dc.subject: Quantile regression [en]
dc.title: Semiparametric Efficient Estimators in Primary and Secondary Analysis of Case-Control Studies [en]
dc.type: Thesis [en]
thesis.degree.department: Statistics [en]
thesis.degree.discipline: Statistics [en]
thesis.degree.grantor: Texas A & M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Hart, Jeffrey
dc.contributor.committeeMember: Pourahmadi, Mohsen
dc.contributor.committeeMember: Li, Qi
dc.type.material: text [en]
dc.date.updated: 2017-08-21T14:38:26Z
local.embargo.terms: 2019-05-01
local.etdauthor.orcid: 0000-0002-9509-7727

