
dc.contributor.advisorYoon, Byung-Jun
dc.contributor.advisorQian, Xiaoning
dc.creatorGadiyaram, Manasa
dc.date.accessioned2023-02-07T16:20:42Z
dc.date.available2024-05-01T06:06:24Z
dc.date.created2022-05
dc.date.issued2022-04-19
dc.date.submittedMay 2022
dc.identifier.urihttps://hdl.handle.net/1969.1/197363
dc.description.abstractLong non-coding RNAs (lncRNAs) are RNA transcripts longer than 200 nucleotides that are not translated into proteins. The study of lncRNAs is important because they have been found to affect a wide range of biological processes, such as epigenetic regulation, metabolic processes, chromosome dynamics, and cell differentiation. This work investigates distinguishing lncRNAs from protein-coding transcripts (PCTs) using a multi-stage high-throughput virtual screening (HTVS) pipeline. Each stage of the pipeline is a support vector machine (SVM) model. Various features associated with RNAs in general have been calculated. These features are divided into three groups: sequence-based, secondary-structure-based, and physicochemical-property-based. They were first calculated and analyzed in a method called LncFinder. SVMs, which are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis, have been trained on these features on the basis of complexity and time taken for calculation. The trained SVMs have then been arranged as the stages of an HTVS pipeline. The pipeline has been optimized using an optimization framework that determines the screening threshold of each stage. The final set of lncRNAs obtained can then be used for drug discovery purposes. This multi-stage classification process significantly reduces the effective selection cost per potential candidate and makes HTVS pipelines less sensitive to their structural variations.
dc.format.mimetypeapplication/pdf
dc.language.isoen
dc.subjectLong non-coding RNA's
dc.subjectHigh throughput virtual screening pipelines
dc.subjectOptimization
dc.titleConstruction of an Optimized Multi-Stage High-Throughput Virtual Screening Pipeline for Long Non-Coding RNA's
dc.typeThesis
thesis.degree.departmentElectrical and Computer Engineering
thesis.degree.disciplineElectrical Engineering
thesis.degree.grantorTexas A&M University
thesis.degree.nameMaster of Science
thesis.degree.levelMasters
dc.contributor.committeeMemberNarayanan, Krishna
dc.contributor.committeeMemberJayaraman, Arul
dc.type.materialtext
dc.date.updated2023-02-07T16:20:43Z
local.embargo.terms2024-05-01
local.etdauthor.orcid0000-0003-3597-8968
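
The multi-stage screening idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the `screen` function, the two toy scoring functions (stand-ins for trained SVM decision functions), the thresholds, and the per-candidate costs are all hypothetical. It shows only the general pattern of a cheap stage filtering candidates before a costlier stage is applied.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is (scoring function, screening threshold, cost per scored candidate).
# All three are hypothetical placeholders, not values from the thesis.
Stage = Tuple[Callable[[str], float], float, float]


def screen(candidates: Sequence[str], stages: List[Stage]) -> Tuple[List[str], float]:
    """Pass candidates through the stages in order.

    A candidate survives a stage if its score is >= that stage's threshold.
    Returns the final survivors and the total scoring cost incurred.
    """
    survivors = list(candidates)
    total_cost = 0.0
    for scorer, threshold, cost in stages:
        # Every candidate still in the pipeline is scored at this stage.
        total_cost += cost * len(survivors)
        survivors = [c for c in survivors if scorer(c) >= threshold]
    return survivors, total_cost


def length_score(seq: str) -> float:
    """Toy cheap score: sequence length (stand-in for a fast model)."""
    return float(len(seq))


def gc_score(seq: str) -> float:
    """Toy expensive score: GC fraction (stand-in for a slower model)."""
    return (seq.count("G") + seq.count("C")) / len(seq)


candidates = ["ATGC", "GGCCGGCC", "ATATATAT", "GC"]
stages = [
    (length_score, 4.0, 1.0),   # cheap stage applied to all candidates
    (gc_score, 0.6, 10.0),      # costly stage applied only to survivors
]
selected, cost = screen(candidates, stages)
print(selected, cost)
```

In this toy setting the costly stage scores three candidates instead of four, which is the kind of saving a threshold optimization would trade off against the risk of discarding true positives early.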

