Topics on the role of Cholesky factor in Learning high dimensional graphical models
Abstract
In modern multivariate statistics, learning from ``Big Data'' is ubiquitous, and understanding the relationships and dependencies among variables is imperative for developing learning algorithms. Unequivocally, the (inverse) covariance matrix and Bayesian Networks are the most fundamental objects that specify multivariate associations and dependencies.
The time series literature is rich in methods advocating the use of the Cholesky factor to model temporal dependence and dynamics in data. Recently, a similar movement has been evolving in the modern statistical and machine learning literature, where the focus is on the estimation of (inverse) covariance matrices and Bayesian Networks. The main contributions of this dissertation pivot around two topics: sparsity and smoothness of the Cholesky factor.
The smoothness of the subdiagonals of the Cholesky factor of a large covariance matrix is closely related to the degree of nonstationarity of the autoregressive model for time series and longitudinal data. Heuristically, for a nearly stationary covariance matrix one expects the entries in each subdiagonal of the Cholesky factor of its inverse to be approximately the same, in the sense that the sum of absolute differences of successive terms is small or can be bounded. Statistically, such smoothness is achieved by regularizing each subdiagonal using fused-type lasso penalties. In Chapter 2, we rely on the Cholesky factor as the new parameter within a regularized normal likelihood setup, which guarantees: (1) joint convexity of the likelihood function, (2) strict convexity of the likelihood function restricted to each subdiagonal even when n < p, and (3) positive definiteness of the estimated covariance matrix. A block coordinate descent algorithm, in which each block is a subdiagonal, is proposed, and its convergence is established under mild conditions. Simulation results and real data analysis show the scope and good performance of the proposed methodology.
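The smoothness heuristic above can be illustrated with a minimal numerical sketch (not the dissertation's estimator). Assuming a stationary AR(1) covariance with coefficient phi, the unit lower-triangular factor T in the modified Cholesky decomposition Sigma^{-1} = T' D^{-1} T has an essentially constant first subdiagonal, which is exactly the structure a fused-type penalty on each subdiagonal encourages:

```python
import numpy as np

# Stationary AR(1) covariance: Sigma_ij = phi^|i-j| / (1 - phi^2).
phi, p = 0.5, 20
idx = np.arange(p)
sigma = phi ** np.abs(idx[:, None] - idx[None, :]) / (1 - phi**2)

# From Sigma = L L' we get Sigma^{-1} = M' M with M = L^{-1} lower
# triangular; T = diag(M)^{-1} M is unit lower triangular and
# D^{-1} = diag(M)^2, recovering Sigma^{-1} = T' D^{-1} T.
L = np.linalg.cholesky(sigma)
M = np.linalg.inv(L)
T = M / np.diag(M)[:, None]

# The first subdiagonal of T holds the negated AR coefficients, which
# are (numerically) all equal to -phi; deeper subdiagonals vanish.
print(np.diag(T, k=-1)[:3])
print(np.max(np.abs(np.diag(T, k=-2))))
```

For a nonstationary model, the subdiagonal entries would vary slowly along each diagonal rather than being exactly constant, motivating the fused-type penalty on successive differences.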
In Chapter 3, we propose an algorithm to learn Gaussian Bayesian Networks. The impetus of our work is the observation that the Cholesky factor of the inverse covariance matrix encodes the structure of a directed acyclic graph (DAG) when the ordering of variables is known. However, the combinatorial problem of learning the order of variables in DAGs is NP-hard and computationally infeasible for high-dimensional problems. We introduce the permutation matrix as a new parameter within a regularized Gaussian log-likelihood to estimate the ordering of variables. The proposed algorithm iteratively learns DAGs by optimizing the regularized likelihood function over the set of permutation and lower triangular matrices. First, by relaxation, it finds the permutation matrix; then, for a given ordering, it estimates a sparse Cholesky factor by decoupling the problem row-wise. The convergence and statistical properties of the algorithm at each step are established under mild conditions. We use our methodology to analyze a macro-economic dataset.
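The row-wise decoupling step can be sketched as follows. This is a hypothetical simplification, not the dissertation's estimator: once an ordering is fixed, row i of the unit lower-triangular Cholesky factor is the sparse regression of variable i on variables 1, ..., i-1, and a tiny coordinate-descent lasso stands in for the penalized row-wise solver. The simulated data, the penalty level `lam`, and the helper `lasso_cd` are all assumptions for illustration:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Toy coordinate-descent lasso: argmin 0.5||y - Xb||^2 + lam||b||_1."""
    n, d = X.shape
    beta = np.zeros(d)
    col_sq = (X**2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual excluding coordinate j, then soft-threshold.
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))
X[:, 3] += 0.8 * X[:, 1]          # variable 4 depends only on variable 2

# Given the ordering 1..p, fill each row of the unit lower-triangular
# factor T by a sparse regression of variable i on its predecessors.
T = np.eye(p)
for i in range(1, p):
    T[i, :i] = -lasso_cd(X[:, :i], X[:, i], lam=0.2 * n)

print(np.round(T[3, :3], 2))      # only the true parent (column 1) survives
```

The zero pattern of T below the diagonal is the (ordered) DAG structure: a nonzero entry T[i, j] corresponds to an edge from variable j to variable i.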
Citation
Dallakyan, Aramayis (2021). Topics on the role of Cholesky factor in Learning high dimensional graphical models. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/195808.