The full text of this item is not available at this time because the student has placed this item under an embargo for a period of time. The Libraries are not authorized to provide a copy of this work during the embargo period, even for Texas A&M users with NetID.
Reproducing Kernel Hilbert Space Modeling in Functional Data Analysis and Causal Inference
Abstract
Reproducing kernel Hilbert space (RKHS) modeling is a popular approach to nonparametric function estimation. This dissertation demonstrates how to exploit the properties of RKHSs to construct estimators that have appealing theoretical performance and are computationally feasible, in the areas of functional data analysis and causal inference. Three main projects are included.
The first project studies low-rank covariance function estimation for multidimensional functional data. Multidimensional functional data now arise in many fields, and the covariance function plays an important role in the analysis of such increasingly common data. In this dissertation, we propose a novel nonparametric covariance function estimation approach under the framework of RKHS that can handle both sparse and dense functional data. We extend multilinear rank structures for (finite-dimensional) tensors to functions, which allows for flexible modeling of both covariance operators and marginal structures. The proposed framework guarantees that the resulting estimator is automatically positive semi-definite, and it can incorporate various spectral regularizations. The trace-norm regularization in particular can promote low ranks for both the covariance operator and the marginal structures. Despite the lack of a closed form, under mild assumptions, the proposed estimator can achieve unified theoretical results that hold for any relative magnitudes between the sample size and the number of observations per sample field, and the rate of convergence reveals the “phase-transition” phenomenon from sparse to dense functional data. Based on a new representer theorem, an ADMM algorithm is developed for the trace-norm regularization. The appealing numerical performance of the proposed estimator is demonstrated by a simulation study and the analysis of a dataset from the Argo project.
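As a generic illustration of why trace-norm regularization promotes low rank, the sketch below applies singular value soft-thresholding, the proximal operator of the trace norm, which is a common building block in ADMM-type solvers. It is not the dissertation's algorithm: the function name `singular_value_threshold`, the toy matrix, the noise level, and the threshold `tau` are all assumptions made for this example.

```python
import numpy as np

def singular_value_threshold(M, tau):
    """Proximal operator of tau * (trace norm): soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # singular values below tau become exactly zero
    return (U * s_shrunk) @ Vt           # rebuild the matrix from the shrunken spectrum

# Hypothetical toy data: a rank-2 positive semi-definite matrix plus small noise.
u = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
v = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
low_rank = np.outer(u, u) + np.outer(v, v)

rng = np.random.default_rng(0)
noisy = low_rank + 0.01 * rng.standard_normal((6, 6))

denoised = singular_value_threshold(noisy, tau=0.5)

rank_noisy = np.linalg.matrix_rank(noisy)
rank_denoised = np.linalg.matrix_rank(denoised)
```

In this toy setting the noise inflates the numerical rank of the matrix, while thresholding removes the small singular values and leaves the two dominant ones, so `rank_denoised` is smaller than `rank_noisy`.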
In the second project, we study nonparametric estimation for the partially conditional average treatment effect, defined as the treatment effect function over a subset of confounders of interest. We propose a hybrid kernel weighting estimator where the weights aim to control the balancing error of any function of the confounders from an RKHS after kernel smoothing over the subset of variables of interest. In addition, we present an augmented version of our estimator that can incorporate estimates of the outcome mean functions. Based on the representer theorem, gradient-based algorithms can be applied to solve the corresponding infinite-dimensional optimization problem. Asymptotic properties are studied without any smoothness assumptions for the propensity score function or the need for data splitting, relaxing certain existing stringent assumptions. The numerical performance of the proposed estimator is demonstrated by a simulation study and an application to the effect of a mother’s smoking on a baby’s birth weight conditioned on the mother’s age.
The last project lies at the intersection of functional data analysis and causal inference. Motivated by physical activity data, we study continuous treatment effect estimation for functional treatments. Unlike for discrete or continuous variables, the density of a functional variable is not well established because of its intrinsically infinite-dimensional nature. We generalize the covariate balancing idea investigated in the second project, and develop a novel weighted estimator under the RKHS modeling assumption for the target treatment effect function whose input is also a function. Balancing weights are constructed to minimize the distance between the solution of kernel ridge regression with weighted responses and the target effect function for any “potential” outcome mean function in a tensor-product RKHS. Theoretical results are developed under the empirical norm metric. We show that the proposed estimator can achieve the optimal convergence rate without any smoothness assumptions for the true weight function. The appealing numerical performance of the proposed estimator is demonstrated by a simulation study and an application to the effect of activity profile on people’s BMI using the physical activity data.
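To make "kernel ridge regression with weighted responses" concrete, the following minimal sketch fits such a regression when each treatment is a curve observed on a common grid. It is an illustration under stated assumptions, not the dissertation's estimator: the Gaussian kernel on discretized curves, the helper names `gaussian_gram` and `weighted_krr_coefficients`, the simulated curves and outcomes, and the uniform placeholder weights are all hypothetical. In particular, the balancing weights described above would come from a separate optimization that is not shown here.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel between curves sampled on a common grid.

    A has shape (n, m) and B has shape (p, m); the mean over the grid
    approximates the squared L2 distance between curves on [0, 1].
    """
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).mean(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def weighted_krr_coefficients(K, y, w, lam):
    """Solve (K + n * lam * I) alpha = w * y, i.e. ridge regression on weighted responses."""
    n = K.shape[0]
    return np.linalg.solve(K + n * lam * np.eye(n), w * y)

# Hypothetical functional treatments: scaled sine curves on a grid of 20 points.
grid = np.linspace(0.0, 1.0, 20)
scales = np.linspace(-1.0, 1.0, 30)
treatments = scales[:, None] * np.sin(2 * np.pi * grid)[None, :]
outcomes = scales ** 2

# Placeholder weights (assumption): uniform, standing in for balancing weights.
weights = np.ones_like(outcomes)
lam = 1e-3

K = gaussian_gram(treatments, treatments, bandwidth=0.5)
alpha = weighted_krr_coefficients(K, outcomes, weights, lam)

# Evaluate the fitted effect function at a new treatment curve.
new_curve = 0.5 * np.sin(2 * np.pi * grid)[None, :]
prediction = gaussian_gram(new_curve, treatments, bandwidth=0.5) @ alpha
```

Replacing `weights` with a different weight vector changes only the right-hand side of the linear system, which is why the fitted function depends on the weights in a simple linear way.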
Subject
Reproducing kernel Hilbert space
covariance function
multidimensional functional data
causal inference
covariate balancing
conditional average treatment effect
functional treatment
Citation
Wang, Jiayi (2022). Reproducing Kernel Hilbert Space Modeling in Functional Data Analysis and Causal Inference. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/197882.