Show simple item record

dc.contributor.advisor	Wong, Raymond K. W.
dc.creator	Li, Jiangyuan
dc.date.accessioned	2023-10-12T13:51:14Z
dc.date.available	2023-10-12T13:51:14Z
dc.date.created	2023-08
dc.date.issued	2023-07-18
dc.date.submitted	August 2023
dc.identifier.uri	https://hdl.handle.net/1969.1/199763
dc.description.abstract	Modern machine learning tasks often involve the training of over-parameterized models and the challenge of addressing data bias. Despite recent advances, a significant knowledge gap remains in both areas. This thesis aims to push the boundaries of our understanding by exploring the implicit bias of neural network training and proposing strategies for mitigating data bias in matrix completion. In the first result, we study the implicit regularization of gradient descent on a diagonal linear neural network of general depth N under a realistic setting of noise and correlated designs. We characterize the impact of depth and early stopping and show that, for a general depth parameter N, gradient descent with early stopping achieves minimax optimal sparse recovery given sufficiently small initialization and step size. In particular, we show that increasing depth enlarges the scale of working initialization and widens the early-stopping window, so that this implicit sparse regularization effect is more likely to take place. Continuing our exploration of implicit bias, our second main result introduces a novel neural reparametrization known as the "diagonally grouped linear neural network". This reparametrization exhibits a notable property: gradient descent, operating on the squared regression loss without explicit regularization, biases towards solutions with a group sparsity structure. In contrast to many existing works on understanding implicit regularization, we prove that our training trajectory cannot be simulated by mirror descent. Compared to existing bounds for implicit sparse regularization using diagonal linear networks, our analysis with the new reparametrization shows improved sample complexity in the general noise setting. In our third result, we propose a pseudolikelihood approach for matrix completion with informative missing. We focus on a flexible and generally applicable missing mechanism, which contains both ignorable and nonignorable missing as special cases. We show that the regularized pairwise pseudolikelihood estimator can recover the low-rank matrix up to a constant shift and scaling while effectively mitigating the impact of data bias.
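The first result described in the abstract can be illustrated with a minimal toy sketch (not the thesis's actual code or experiments): gradient descent on a depth-2 diagonal linear reparametrization beta = u*u - v*v of the squared loss, started from a small initialization and stopped early, which is known to behave like sparse regression. All dimensions, seeds, and step sizes below are our own illustrative choices; the thesis treats general depth N.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sparse regression: n samples, d features, k-sparse ground truth.
n, d, k = 50, 200, 3
beta_star = np.zeros(d)
beta_star[:k] = [1.0, -1.0, 1.0]
X = rng.standard_normal((n, d)) / np.sqrt(n)   # roughly unit-norm columns
y = X @ beta_star + 0.01 * rng.standard_normal(n)

# Depth-2 diagonal reparametrization beta = u*u - v*v; plain gradient
# descent on 0.5 * ||X beta - y||^2, small initialization, early stopping.
alpha, eta, T = 1e-3, 0.1, 2000
u = alpha * np.ones(d)
v = alpha * np.ones(d)
for _ in range(T):
    g = X.T @ (X @ (u * u - v * v) - y)        # gradient w.r.t. beta
    u, v = u - eta * 2 * g * u, v + eta * 2 * g * v

beta_hat = u * u - v * v
# Entries on the true support grow multiplicatively and are fit first,
# while off-support entries stay near alpha**2 inside the stopping window,
# so beta_hat is approximately k-sparse without any explicit penalty.
```

No explicit l1 penalty appears anywhere; the sparsity comes entirely from the reparametrization, the small initialization scale alpha, and early stopping, which is the phenomenon the first result quantifies.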
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	learning algorithms
dc.subject	linear neural networks
dc.subject	implicit bias
dc.subject	matrix completion
dc.subject	informative missing
dc.title	Learning Under Implicit Bias and Data Bias
dc.type	Thesis
thesis.degree.department	Statistics
thesis.degree.discipline	Statistics
thesis.degree.grantor	Texas A&M University
thesis.degree.name	Doctor of Philosophy
thesis.degree.level	Doctoral
dc.contributor.committeeMember	Pati, Debdeep
dc.contributor.committeeMember	Zhang, Xianyang
dc.contributor.committeeMember	Zhang, Ke
dc.type.material	text
dc.date.updated	2023-10-12T13:51:15Z
local.etdauthor.orcid	0000-0001-9983-1119

