
dc.contributor.advisor: Yu, Peng
dc.creator: Li, Jin
dc.date.accessioned: 2019-01-23T19:45:19Z
dc.date.created: 2018-12
dc.date.issued: 2018-10-03
dc.date.submitted: December 2018
dc.identifier.uri: http://hdl.handle.net/1969.1/174429
dc.description.abstract: A large volume of gene expression data is being generated to study the mechanisms of various biological processes. These valuable data have enabled a range of computational analyses that accelerate biological understanding. However, analyzing the data efficiently to mine new knowledge remains a challenge. The data were generated for different purposes, and their heterogeneity makes it difficult to integrate the datasets consistently, which slows their reuse and the pace of biological discovery. To facilitate the reuse of these data, we engaged biology experts to manually collect RNA-Seq gene expression datasets for perturbed splicing factors and RNA-binding proteins, resulting in two online databases, SFMetaDB and RBPMetaDB. These two databases hold comprehensive RNA-Seq gene expression data for mouse splicing factors and RNA-binding proteins, and they can be used to identify key genes or regulators in biological processes or human diseases. Besides showing the importance of the two databases, these projects also demonstrated an efficient way to collect data. In my dissertation, we also engaged biology collaborators to collect a comprehensive set of regulator genes in cold-induced thermogenesis supported by in vivo experiments, with the key genes deposited in CITGeneDB. This database is the first to offer a comprehensive list of regulators in cold-induced thermogenesis in a higher regulatory hierarchy. In addition to building data resources, my dissertation also analyzed RNA-Seq gene expression data to gain biological insights. To study the mechanism of the human skin disease psoriasis, we analyzed public mouse and human psoriasis datasets and compared them with the splicing factor perturbation datasets in SFMetaDB, yielding candidate genes for psoriasis. Our computational predictions provide candidate factors for follow-up studies of the fundamental processes underlying psoriasis.
In addition, we introduced a data processing paradigm to identify key genes in biological processes via systematic collection of gene expression datasets, primary analysis of the data, and evaluation of consistent signals. We applied the paradigm to epidermal development and cold-induced thermogenesis, and it revealed many key genes in both. By collaborating with wet labs, we experimentally validated a novel gene, suprabasin (SBSN), in epidermal development. These findings enable a better understanding of the mechanisms underlying epidermal development and cold-induced thermogenesis, and they demonstrate the effectiveness of our paradigm, which combines data collection with integrated analysis. My dissertation has mainly investigated a biological data processing paradigm consisting of systematic data collection, data analysis, and hypothesis generation. Through this work, we demonstrated the effectiveness of this novel approach, which can be readily generalized to other biological processes or human diseases. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Data mining [en]
dc.subject: Gene expression [en]
dc.title: Data Mining for Identifying Key Genes in Biological Processes Using Gene Expression Data [en]
dc.type: Thesis [en]
thesis.degree.department: Electrical and Computer Engineering [en]
thesis.degree.discipline: Electrical Engineering [en]
thesis.degree.grantor: Texas A & M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Braga-Neto, Ulisses
dc.contributor.committeeMember: Cai, James
dc.contributor.committeeMember: Hu, Jiang
dc.type.material: text [en]
dc.date.updated: 2019-01-23T19:45:19Z
local.embargo.terms: 2020-12-01
local.embargo.lift: 2020-12-01
local.etdauthor.orcid: 0000-0003-0595-8309

