Semiparametric Analysis of Complex Polygenic Gene-Environment Interactions in Case-Control Studies
Abstract
Gene-environment interactions can be efficiently estimated in case-control data by existing
retrospective methods that assume gene-environment independence in the source population, but
such techniques require parametric modeling of the genetic variables. Standard logistic regression
analysis of case-control data has low power to detect gene-environment interactions, but it has been
the only method capable of analyzing complex polygenic data for which parametric distributional
models are not feasible.
This dissertation proposes a general, computationally simple, semiparametric method for analysis
of case-control studies that allows exploitation of the assumption of gene-environment independence
without any further parametric modeling assumptions about the marginal distribution of
either set of factors. The method relies on the key observation that an underlying efficient
profile likelihood depends on the distribution of genetic factors only through certain expectation
terms that can be evaluated empirically.
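
As a rough illustration of evaluating such an expectation term empirically, the sketch below averages a logistic risk model over a set of observed genetic values instead of integrating against a parametric genetic distribution. It is not the dissertation's estimator. The risk model, the function names, and the parameter layout are all hypothetical, and the sketch assumes `g_sample` behaves like a random sample from the source population, so the case-control reweighting that a real analysis needs is omitted.

```python
import numpy as np


def logistic_risk(g, x, beta):
    """Hypothetical risk model P(D=1 | g, x): main effects plus a G*X interaction."""
    b0, bg, bx, bgx = beta
    eta = b0 + bg * g + bx * x + bgx * g * x
    return 1.0 / (1.0 + np.exp(-eta))


def empirical_expectation(x, g_sample, beta):
    """Approximate E_G[P(D=1 | G, x)] by a sample average over observed G values.

    Under gene-environment independence the distribution of G does not
    depend on x, so the same genetic sample can be reused for every x.
    Assumption: g_sample is representative of the source population.
    """
    g_sample = np.asarray(g_sample, dtype=float)
    return float(np.mean(logistic_risk(g_sample, x, beta)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative genetic score; no distributional model is fitted to it.
    g_obs = rng.normal(size=500)
    beta = (-2.0, 0.3, 0.5, 0.2)  # hypothetical parameter values
    print(empirical_expectation(x=1.0, g_sample=g_obs, beta=beta))
```

The point of the design is that no density for the genetic variables is ever specified: the sample of genetic values stands in for their distribution wherever an expectation over it is required.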
The method is further improved by treating the genetic and environmental variables symmetrically,
which yields two sets of parameter estimates that are then combined into a single, more efficient
estimate. A semiparametric framework is employed to develop the asymptotic theory of the estimators,
and their performance is evaluated via simulation studies. The methods are illustrated using
data from a case-control study of breast cancer, and free software implementing both methods is
demonstrated.
Subject
Case-control studies
Gene-environment interactions
Genetic epidemiology
Pseudolikelihood
Retrospective studies
Semiparametric methods
Citation
Asher, Alexander Allen (2018). Semiparametric Analysis of Complex Polygenic Gene-Environment Interactions in Case-Control Studies. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/174127.