Show simple item record

dc.contributor.advisor: Zhang, Xianyang
dc.creator: Chakraborty, Shubhadeep
dc.date.accessioned: 2021-01-29T16:51:26Z
dc.date.available: 2022-08-01T06:53:53Z
dc.date.created: 2020-08
dc.date.issued: 2020-07-23
dc.date.submitted: August 2020
dc.identifier.uri: https://hdl.handle.net/1969.1/192210
dc.description.abstract: Measuring and testing for independence and homogeneity of distributions are fundamental problems in statistics, with applications in a wide variety of areas such as independent component analysis, gene selection, graphical modeling, causal inference, goodness-of-fit testing, and change-point detection. Szekely et al. (2007), in their seminal paper, introduced the notion of distance covariance (dCov) as a measure of dependence between two random vectors of arbitrary (but fixed) dimensions. The key feature of dCov is that it takes the value zero if and only if the two random vectors are independent, thereby completely characterizing independence between them. However, many statistical applications, such as independent component analysis and diagnostic checking for structural equation modeling, require the quantification of joint independence among d >= 2 random vectors, which is a quite different and more ambitious task than testing for pairwise independence of a collection of random vectors. The first work (Chapter 2) proposes a new dependence metric called the Joint Distance Covariance (JdCov), which generalizes distance covariance to quantify joint dependence among d >= 2 random vectors of arbitrary (but fixed) dimensions. JdCov takes the value zero if and only if the d random vectors are jointly independent, and thereby completely characterizes their joint independence. We propose empirical estimators of JdCov, study their asymptotic behavior, and consequently propose a consistent bootstrap-based nonparametric test for joint independence. The proposed dependence metrics are employed to perform model selection in causal inference, based on joint independence testing of the residuals from the fitted structural equation models. The effectiveness of the method is illustrated via both simulated and real datasets.
The second work (Chapter 3) proposes nonparametric tests for homogeneity and independence between two high-dimensional random vectors. Energy distance (proposed by Szekely and Rizzo (2004)) is a classical measure of equality of two multivariate distributions, taking the value zero if and only if the two random vectors are identically distributed. Our work shows that energy distance based on the usual Euclidean distance cannot completely characterize the homogeneity of two high-dimensional distributions, in the sense that it can only detect the equality of the means and of the traces of the covariance matrices of two high-dimensional random vectors. In other words, the classical energy distance fails to detect inhomogeneity between two high-dimensional distributions beyond the first two moments. It has also been pointed out very recently by Zhu et al. (2019) that the classical distance covariance can only capture component-wise linear dependence between two high-dimensional random vectors. These limitations of the classical energy distance and distance covariance arise from the use of Euclidean distance, and we propose a new class of distance metrics for high-dimensional Euclidean spaces to overcome them. Based on the new distance metrics, we propose a new class of homogeneity/dependence metrics that inherit the desirable properties of the classical energy distance/distance covariance in the low-dimensional setting. More importantly, in the high-dimensional setup the new metrics are capable of completely characterizing the homogeneity/independence between the low-dimensional marginal distributions, going beyond the scope of the classical energy distance/distance covariance. Moreover, we propose t-tests based on the new metrics to perform high-dimensional two-sample testing/independence testing in a fully nonparametric framework and study their asymptotic properties. We use our methodology to analyze cross-sector independence of (high-dimensional) stock price data.
Change-point detection has been a classical problem in statistics, with applications in a wide variety of fields. A nonparametric change-point detection procedure is concerned with detecting abrupt changes in the data-generating distribution, rather than only changes in mean. In the third work (Chapter 4), we consider the problem of detecting an unknown number of change-points in an independent sequence of high-dimensional observations and testing for the significance of the estimated change-point locations. Our approach essentially rests upon nonparametric tests for the homogeneity of two high-dimensional distributions. We construct a single change-point location estimator by defining a cumulative sum process in an embedded Hilbert space. As the key theoretical innovation, we rigorously derive its limiting distribution under the high dimension medium sample size (HDMSS) framework. Subsequently, we combine our statistic with the idea of wild binary segmentation to recursively estimate and test for multiple change-point locations. The superior performance of our methodology compared to several other existing procedures is illustrated via both simulated and real datasets. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Change-point Detection [en]
dc.subject: Causal Inference [en]
dc.subject: High-dimensional Statistics [en]
dc.subject: Nonparametric Statistics [en]
dc.title: DISTANCE AND KERNEL-BASED NONPARAMETRIC TESTS FOR INDEPENDENCE AND HOMOGENEITY OF DISTRIBUTIONS, AND THEIR APPLICATIONS [en]
dc.type: Thesis [en]
thesis.degree.department: Statistics [en]
thesis.degree.discipline: Statistics [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Huang, Jianhua
dc.contributor.committeeMember: Pourahmadi, Mohsen
dc.contributor.committeeMember: Huang, Ruihong
dc.type.material: text [en]
dc.date.updated: 2021-01-29T16:51:26Z
local.embargo.terms: 2022-08-01
local.etdauthor.orcid: 0000-0001-6555-8900
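
The abstract builds on two classical quantities: the distance covariance of Szekely et al. (2007) and the energy distance of Szekely and Rizzo (2004). As a rough illustration only, the sketch below computes their standard V-statistic sample versions with the usual Euclidean distance. It is not the thesis's JdCov or its new high-dimensional metrics, and the function names (`dcov_sq`, `energy_distance`, and the helpers) are hypothetical, chosen for this example. Samples are assumed to be lists of equal-length tuples.

```python
import math


def pairwise(xs, ys):
    """Matrix of Euclidean distances between every point in xs and ys."""
    return [[math.dist(x, y) for y in ys] for x in xs]


def matrix_mean(m):
    """Mean of all entries of a rectangular matrix."""
    return sum(map(sum, m)) / (len(m) * len(m[0]))


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row = [sum(r) / n for r in d]
    col = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[d[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]


def dcov_sq(xs, ys):
    """Squared sample distance covariance (V-statistic) of paired samples."""
    n = len(xs)
    a = double_center(pairwise(xs, xs))
    b = double_center(pairwise(ys, ys))
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2


def energy_distance(xs, ys):
    """Sample energy distance: 2 E|X-Y| - E|X-X'| - E|Y-Y'| (V-statistic)."""
    return (2 * matrix_mean(pairwise(xs, ys))
            - matrix_mean(pairwise(xs, xs))
            - matrix_mean(pairwise(ys, ys)))
```

The "zero if and only if" characterizations in the abstract hold for the population quantities; the sample values computed here are estimates, which is why the thesis pairs them with formal tests rather than comparing them to zero directly.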

