Show simple item record

dc.contributor.advisor: Braga-Neto, Ulisses M.
dc.creator: Arslan, Emre
dc.date.accessioned: 2019-01-23T20:35:01Z
dc.date.created: 2018-12
dc.date.issued: 2018-11-09
dc.date.submitted: December 2018
dc.identifier.uri: http://hdl.handle.net/1969.1/174506
dc.description.abstract: Statistical analysis of high-dimensional biological data is the central component of “personalized medicine” and “translational bioinformatics.” Two major barriers limit the application of the extracted information in clinical studies: small sample sizes and the lack of biological interpretability caused by the complex classification boundaries of current algorithms. Motivated by the need to remove these barriers, this dissertation introduces novel algorithms for the statistical analysis of high-dimensional biological data. We first introduce a novel predictive model that extends the top-scoring pair algorithm to a Bayesian setting. We test its performance on several real data sets and various simulated data scenarios and show that the proposed method has the best overall performance. Besides achieving high accuracy on real and simulated data sets, the proposed algorithm has the potential to discover gene markers that other algorithms may miss. We also propose the Bayesian Top-Scoring Pair (BTSP) algorithm as a feature selection method. We compare it with many well-known feature selection methods by combining each feature selection method with different well-known classifiers, and we evaluate all feature selection methods on different data sets and for different numbers of genes. The proposed BTSP algorithm has the best overall accuracy. Finally, we introduce a novel algorithm based on biological pathway data (BTSPP), which uses all pairwise interactions at the gene level and the pathway level. We apply the proposed method and well-known pathway-based algorithms to different real data sets, evaluate how accurately they classify independent test sets, and show the superiority of the proposed algorithm. We also examine how well these pathway-based algorithms, over-representation analysis (ORA), and gene set enrichment analysis (GSEA) find biologically validated pathways related to diseases. The proposed pathway analysis method has the potential to find the biologically validated pathways, whereas the others cannot detect them. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Gene Expression Classification [en]
dc.subject: Bayesian Methods [en]
dc.subject: TSP [en]
dc.title: A Novel Bayesian Rank-Based Framework for the Classification of High-Dimensional Biological Data [en]
dc.type: Thesis [en]
thesis.degree.department: Electrical and Computer Engineering [en]
thesis.degree.discipline: Electrical Engineering [en]
thesis.degree.grantor: Texas A & M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Dougherty, Edward
dc.contributor.committeeMember: Serpedin, Erchin
dc.contributor.committeeMember: Dabney, Alan
dc.type.material: text [en]
dc.date.updated: 2019-01-23T20:35:01Z
local.embargo.terms: 2020-12-01
local.embargo.lift: 2020-12-01
local.etdauthor.orcid: 0000-0003-4978-3715

