Show simple item record

dc.contributor.advisor: Garcia, Tanya P
dc.contributor.advisor: Pourahmadi, Mohsen
dc.creator: Kim, Rakheon
dc.date.accessioned: 2023-02-07T16:23:28Z
dc.date.available: 2023-02-07T16:23:28Z
dc.date.created: 2022-05
dc.date.issued: 2022-04-20
dc.date.submitted: May 2022
dc.identifier.uri: https://hdl.handle.net/1969.1/197394
dc.description.abstract: This dissertation discusses how to exploit sparsity, the statistical assumption that only a small number of relationships between variables are non-zero, in model selection for regression and in covariance matrix estimation. In a linear model, the effects of the predictors on the response may vary across individuals. In that case, the purpose of model selection is not only to identify significant predictors but also to understand how their effects on the response differ across individuals. This can be cast as a model selection problem for a varying-coefficient regression, which becomes challenging when there is a pre-specified group structure among the variables. We propose a novel variable selection method for varying-coefficient regression with such structured variables. Our method is empirically shown to select relevant variables consistently and to screen out irrelevant variables better than existing methods, yielding a model with higher sensitivity, a lower false discovery rate, and higher prediction accuracy. We apply this method to a Huntington disease study and find that the effects of the brain regions on motor impairment differ by the patients' disease severity, indicating the need for customized intervention.

In covariance matrix estimation, current approaches to introducing sparsity do not guarantee positive definiteness or asymptotic efficiency. For multivariate normal distributions, we construct a positive definite and asymptotically efficient estimator when the locations of the zero entries are known. When the locations of the zero entries are unknown, we further construct a positive definite thresholding estimator by combining iterative conditional fitting with thresholding, and we prove that this estimator is asymptotically efficient with probability tending to one. In simulation studies, our estimator matches the true covariance more closely and identifies the non-zero entries more accurately than competing estimators. We apply the estimator to Huntington disease data and detect non-zero correlations among brain regional volumes. These correlations are timely for ongoing treatment studies, as they inform how different brain regions are likely to be affected by the treatments.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Covariance
dc.subject: Huntington disease
dc.subject: Model Selection
dc.subject: Regularization
dc.subject: Sparsity
dc.subject: Thresholding
dc.subject: Varying-coefficient model
dc.title: Sparsity in Varying-coefficient Regression and Covariance Matrix Estimation
dc.type: Thesis
thesis.degree.department: Statistics
thesis.degree.discipline: Statistics
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Doctor of Philosophy
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Carroll, Raymond J
dc.contributor.committeeMember: Yan, Catherine H
dc.type.material: text
dc.date.updated: 2023-02-07T16:23:28Z
local.etdauthor.orcid: 0000-0001-9577-7872
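
Note: The abstract above describes a sparse, positive definite covariance estimator built by combining iterative conditional fitting with thresholding. The Python sketch below is not that estimator; it is a minimal illustration of the general sparsity-by-thresholding idea, using generic hard thresholding of the sample covariance followed by an eigenvalue clip to keep the result positive definite. The function name thresholded_covariance, the threshold value, and the simulated data are illustrative assumptions, not material from the dissertation.

# Minimal sketch (not the dissertation's estimator): hard-threshold the off-diagonal
# entries of a sample covariance matrix, then clip eigenvalues so the estimate stays
# positive definite. Illustrates only the sparsity idea mentioned in the abstract.
import numpy as np

def thresholded_covariance(X, threshold, min_eig=1e-6):
    """Hard-threshold the sample covariance of X (n x p) and restore positive definiteness."""
    S = np.cov(X, rowvar=False)            # p x p sample covariance
    mask = np.abs(S) >= threshold          # keep entries at or above the threshold
    np.fill_diagonal(mask, True)           # never threshold the variances
    S_thr = S * mask
    # Thresholding can break positive definiteness; clip eigenvalues as a simple repair.
    eigval, eigvec = np.linalg.eigh(S_thr)
    eigval = np.clip(eigval, min_eig, None)
    return eigvec @ np.diag(eigval) @ eigvec.T

# Hypothetical usage on simulated data:
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Sigma_hat = thresholded_covariance(X, threshold=0.2)
print(np.all(np.linalg.eigvalsh(Sigma_hat) > 0))   # True: the estimate is positive definite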

