Show simple item record

dc.contributor.advisor: Hou, I-Hong
dc.contributor.advisor: Shen, Yang
dc.creator: Wu, Di
dc.date.accessioned: 2019-01-23T22:01:40Z
dc.date.available: 2020-12-01T07:34:00Z
dc.date.created: 2018-12
dc.date.issued: 2018-11-30
dc.date.submitted: December 2018
dc.identifier.uri: https://hdl.handle.net/1969.1/174625
dc.description.abstract: Residue coevolution refers to the biological assumption that residue pairs covary during evolution if they form a contact within a protein or across a protein-protein interface. Under this assumption, such covariance can be used to predict residue contacts within or between protein sequences. The increasing availability of protein sequence data allows for wider applicability and also demands more accurate approaches. Current methods model sequence data with Markov random fields and use maximum likelihood estimation to infer residue contacts. They mainly target the accuracy of contact prediction, on the premise that more accurate 2D contact prediction yields a better 3D structure. This is correct but not the whole picture, since the patterns of predicted 2D contacts also have a significant impact on 3D structure reconstruction. For example, contacts between long-distance residue pairs generally help more than contacts between adjacent residue pairs. Moreover, current methods consistently produce predictions that concentrate in certain areas. To directly target 3D structure prediction, we introduce a new method that exploits more types of data, such as secondary structure and fold type information, to characterize the desired sparsity patterns of contact prediction in a biologically meaningful way. It then uses multiple structured sparsity regularization models, including group LASSO and group dispersive sparsity, to enforce such sparsity patterns. The method benefits from considering and promoting structured sparsity, which contributes to improved 3D structure prediction. (en)
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Protein contact prediction (en)
dc.subject: Machine learning (en)
dc.subject: Dispersive sparsity learning (en)
dc.title: Structured Sparsity Learning for Coevolution-Based Protein Contact Prediction (en)
dc.type: Thesis (en)
thesis.degree.department: Electrical and Computer Engineering (en)
thesis.degree.discipline: Computer Engineering (en)
thesis.degree.grantor: Texas A & M University (en)
thesis.degree.name: Master of Science (en)
thesis.degree.level: Masters (en)
dc.contributor.committeeMember: Hu, Jiang
dc.contributor.committeeMember: Xia Hu
dc.type.material: text (en)
dc.date.updated: 2019-01-23T22:01:41Z
local.embargo.terms: 2020-12-01
local.etdauthor.orcid: 0000-0003-2002-2883

