
dc.contributor.advisor: Qian, Xiaoning
dc.creator: He, Kai
dc.date.accessioned: 2021-02-02T20:18:37Z
dc.date.available: 2022-08-01T06:53:41Z
dc.date.created: 2020-08
dc.date.issued: 2020-07-01
dc.date.submitted: August 2020
dc.identifier.uri: https://hdl.handle.net/1969.1/192309
dc.description.abstract: Longitudinal studies, which take repeated measures by following particular individuals over periods of time, allow researchers to assess the development of and changes in the characteristics of target populations at both the group and the individual level. Early detection and feature learning with longitudinal data benefit society in many areas, such as early clinical diagnosis, process monitoring, manufacturing and social security. Critical to understanding dynamic patterns in general is the capability of detecting and tracking the progression of events of interest, as well as identifying the event-associated factors. This dissertation addresses some of the critical issues concerning early detection, robust feature derivation and variable selection in longitudinal data analysis. Early prediction of disease onset and contemporaneous monitoring of disease-induced progression can be of tremendous help before the disease has time to fully take hold, and can help patients get more appropriate care and treatment. However, accurate prediction and risk estimation of disease onset remain challenging, because disease patterns are often indistinguishable at the early stage and longitudinal data can be irregularly spaced, missing and not fully labeled. To address these issues, we have developed a contemporaneous disease risk detector called EDRA (Early Detection and Risk Assessment), a flexible learning framework based on the Structured-Output Support Vector Machine (SOSVM) technique that incorporates individual-level progression. The performance of EDRA is assessed on datasets of varying complexity, demonstrating its capability for early prediction of disease onset and risk estimation, in terms of both the earliness and the accuracy of detection, with partially labeled longitudinal data.
Along with the challenges of early detection and risk monitoring, the rapid advancement of high-throughput profiling and imaging technologies in recent decades has produced biomedical data of high dimensionality, which highlights the importance of extracting predictive features for accurate disease diagnosis and prognosis, as well as identifying variables of interest to enable targeted predictive interventions and treatments. However, unwanted data variability, including inherent “batch effects”, is commonly observed in data collected across multiple experiments or studies and can bias analytical results. We have developed a principal component analysis (PCA)-based framework, MSSPCA (Matched Supervised Sparse PCA), for robust feature learning that accounts for data heterogeneity. MSSPCA shows superior performance in deriving predictive features, offers variable selection capability and is robust to noisy outcomes. Its effectiveness has been demonstrated through a simulation study and a real-world case study, with a comprehensive performance comparison against several representative and popular existing methods. Finally, we propose a pipeline that integrates EDRA and MSSPCA for robust early detection. The performance of the proposed pipeline is validated on real-world longitudinal RNA-Seq data for early prediction of tuberculosis. In summary, our proposed methods enhance longitudinal data analysis through improved accuracy and robustness for early detection, better model interpretation, and easier learning and inference. Although their benefits are demonstrated in biomedical applications, our proposed methods can also be applied in many other domains that involve longitudinal data analysis. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Early event detection [en]
dc.subject: Robust feature learning [en]
dc.subject: Longitudinal data analysis [en]
dc.title: Early Detection and Robust Feature Learning in Longitudinal Data Analysis [en]
dc.type: Thesis [en]
thesis.degree.department: Electrical and Computer Engineering [en]
thesis.degree.discipline: Electrical Engineering [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Braga-Neto, Ulisses
dc.contributor.committeeMember: Liu, Tie
dc.contributor.committeeMember: Figueiredo, Paul de
dc.type.material: text [en]
dc.date.updated: 2021-02-02T20:18:38Z
local.embargo.terms: 2022-08-01
local.etdauthor.orcid: 0000-0002-5805-3354

