Show simple item record

dc.contributor.advisor: Wong, Raymond
dc.creator: Wang, Jiayi
dc.date.accessioned: 2023-05-26T17:51:42Z
dc.date.created: 2022-08
dc.date.issued: 2022-06-13
dc.date.submitted: August 2022
dc.identifier.uri: https://hdl.handle.net/1969.1/197882
dc.description.abstract: Reproducing kernel Hilbert space (RKHS) modeling is a popular approach to nonparametric function estimation. This dissertation demonstrates how to exploit the properties of RKHSs to construct estimators that have appealing theoretical performance and are computationally feasible, in the areas of functional data analysis and causal inference. Three main projects are included. The first project studies low-rank covariance function estimation for multidimensional functional data. Multidimensional functional data arise in many fields, and the covariance function plays an important role in the analysis of such increasingly common data. In this dissertation, we propose a novel nonparametric covariance function estimation approach under the framework of RKHS that can handle both sparse and dense functional data. We extend multilinear rank structures for (finite-dimensional) tensors to functions, which allows for flexible modeling of both covariance operators and marginal structures. The proposed framework guarantees that the resulting estimator is automatically positive semidefinite, and it can incorporate various spectral regularizations. The trace-norm regularization in particular can promote low ranks for both the covariance operator and the marginal structures. Despite the lack of a closed form, under mild assumptions, the proposed estimator achieves unified theoretical results that hold for any relative magnitudes of the sample size and the number of observations per sample field, and the rate of convergence reveals the “phase-transition” phenomenon from sparse to dense functional data. Based on a new representer theorem, an ADMM algorithm is developed for the trace-norm regularization. The appealing numerical performance of the proposed estimator is demonstrated by a simulation study and the analysis of a dataset from the Argo project.
In the second project, we study nonparametric estimation of the partially conditional average treatment effect, defined as the treatment effect function over a subset of confounders of interest. We propose a hybrid kernel weighting estimator in which the weights aim to control the balancing error of any function of the confounders from an RKHS after kernel smoothing over the subset of variables of interest. In addition, we present an augmented version of our estimator that can incorporate estimates of the outcome mean functions. Based on the representer theorem, gradient-based algorithms can be applied to solve the corresponding infinite-dimensional optimization problem. Asymptotic properties are studied without any smoothness assumptions on the propensity score function and without the need for data splitting, relaxing certain stringent assumptions in the existing literature. The numerical performance of the proposed estimator is demonstrated by a simulation study and an application to the effect of a mother’s smoking on a baby’s birth weight conditional on the mother’s age.
The last project combines functional data analysis and causal inference. Motivated by physical activity data, we study continuous treatment effect estimation for functional treatments. Unlike for discrete or continuous variables, the density of a functional variable is not well established because of its intrinsically infinite-dimensional nature. We generalize the covariate balancing idea investigated in the second project and develop a novel weighted estimator, under an RKHS modeling assumption, for the target treatment effect function whose input is also a function. Balancing weights are constructed to minimize the distance between the solution of kernel ridge regression with weighted responses and the target effect function for any “potential” outcome mean function in a tensor-product RKHS. Theoretical results are developed under the empirical norm metric. We show that the proposed estimator can achieve the optimal convergence rate without any smoothness assumptions on the true weight function. The appealing numerical performance of the proposed estimator is demonstrated by a simulation study and an application to the effect of activity profiles on people’s BMI using the physical activity data.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Reproducing kernel Hilbert space
dc.subject: covariance function
dc.subject: multidimensional functional data
dc.subject: causal inference
dc.subject: covariate balancing
dc.subject: conditional average treatment effect
dc.subject: functional treatment
dc.title: Reproducing Kernel Hilbert Space Modeling in Functional Data Analysis and Causal Inference
dc.type: Thesis
thesis.degree.department: Statistics
thesis.degree.discipline: Statistics
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Doctor of Philosophy
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Carroll, Raymond
dc.contributor.committeeMember: Gaynanova, Irina
dc.contributor.committeeMember: Chaspari, Theodora
dc.type.material: text
dc.date.updated: 2023-05-26T17:51:42Z
local.embargo.terms: 2024-08-01
local.embargo.lift: 2024-08-01
local.etdauthor.orcid: 0000-0003-0807-6192
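
Illustrative note (not part of the deposited record): the abstract refers to the representer theorem and to kernel ridge regression in an RKHS. The sketch below shows the standard textbook form of kernel ridge regression, in which the representer theorem reduces the infinite-dimensional problem to solving an n-by-n linear system. It is not the dissertation's estimator. The Gaussian kernel, the bandwidth, the penalty `lam`, the simulated data, and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=0.5):
    """Gaussian (RBF) kernel matrix between two sets of 1-D points (assumed kernel)."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff ** 2 / (2.0 * bandwidth ** 2))


def fit_kernel_ridge(x, y, lam=1e-3, bandwidth=0.5):
    """Return representer coefficients alpha solving (K + n * lam * I) alpha = y."""
    n = len(x)
    K = gaussian_kernel(x, x, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y)


def predict(x_new, x_train, alpha, bandwidth=0.5):
    """Evaluate the fitted function f(x) = sum_i alpha_i k(x, x_i) at new points."""
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha


# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 1.0, size=50))
y_train = np.sin(2.0 * np.pi * x_train) + 0.1 * rng.standard_normal(50)

alpha = fit_kernel_ridge(x_train, y_train)
x_grid = np.linspace(0.0, 1.0, 5)
fitted = predict(x_grid, x_train, alpha)
```

The fitted function lives in the span of the kernel sections at the training points, which is what makes the optimization finite-dimensional. A weighted variant, as mentioned in the abstract, would modify the responses before solving the same kind of system; the details of that construction are in the dissertation and are not reproduced here.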

