
dc.contributor.advisor: Datta, Aniruddha
dc.creator: Guo, Zhengyu
dc.date.accessioned: 2019-01-16T19:29:05Z
dc.date.available: 2019-12-01T06:33:23Z
dc.date.created: 2017-12
dc.date.issued: 2017-11-02
dc.date.submitted: December 2017
dc.identifier.uri: https://hdl.handle.net/1969.1/173122
dc.description.abstract: High-throughput sequencing (HTS) has become one of the most powerful tools for studies in genomics, transcriptomics, epigenomics, and metagenomics. In recent years, HTS protocols such as RNA-Seq, CLIP-Seq, and RIP-Seq have been designed to improve the understanding of the diverse cellular roles of RNA. In this work, we explore the applications of HTS data analysis in transcriptional studies. First, the differential expression analysis of RNA-Seq data is discussed and applied to a sheep RNA-Seq dataset to examine the biological mechanisms of sheep resistance to worm infection. We develop an automatic pipeline to analyze the RNA-Seq dataset and use a negative binomial model for gene expression analysis. Functional analysis is conducted on the differentially expressed genes, and a broad range of mechanisms providing protection against the parasite are identified in the resistant sheep breed. This study provides insights into the underlying biology of sheep host resistance. Then, a deep learning method is proposed to predict the binding preferences of RNA-binding proteins (RBPs) using CLIP-Seq data. The proposed method uses a deep convolutional autoencoder to learn robust sequence features and a softmax classifier to predict the RBP binding sites. To demonstrate the efficacy of the proposed method, we evaluate its performance on a dataset containing 31 CLIP-Seq experiments. This benchmarking shows that the proposed method achieves a higher AUC than the existing methods. The analysis also shows that the proposed method can help identify new RBP binding motifs. Therefore, the proposed method will be of great help in understanding the dynamic regulation of RBPs in various biological processes and diseases. Finally, a database is created to facilitate the reuse of publicly available mouse RNA-Seq datasets. The metadata of these datasets is manually curated and is served by a well-designed website. The database can be scaled up in the future to serve more types of HTS data. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: High-Throughput Sequencing [en]
dc.subject: RNA-Seq [en]
dc.subject: Differential expression [en]
dc.subject: Deep learning [en]
dc.subject: Database [en]
dc.title: Applications of High-Throughput Sequencing Data Analysis in Transcriptional Studies [en]
dc.type: Thesis [en]
thesis.degree.department: Electrical and Computer Engineering [en]
thesis.degree.discipline: Electrical Engineering [en]
thesis.degree.grantor: Texas A & M University [en]
thesis.degree.name: Doctor of Philosophy [en]
thesis.degree.level: Doctoral [en]
dc.contributor.committeeMember: Braga-Neto, Ulisses
dc.contributor.committeeMember: Qian, Xiaoning
dc.contributor.committeeMember: Dabney, Alan
dc.type.material: text [en]
dc.date.updated: 2019-01-16T19:29:06Z
local.embargo.terms: 2019-12-01
local.etdauthor.orcid: 0000-0002-7990-6053

