Show simple item record

dc.contributor.advisor: Yoon, Byung-Jun
dc.creator: Masud, Mamoon
dc.date.accessioned: 2022-07-27T16:43:43Z
dc.date.available: 2023-12-01T09:23:14Z
dc.date.created: 2021-12
dc.date.issued: 2021-12-07
dc.date.submitted: December 2021
dc.identifier.uri: https://hdl.handle.net/1969.1/196383
dc.description.abstract: Cancer is one of the leading causes of death worldwide, accounting for almost 10 million deaths in 2020 alone. Anti-cancer drug discovery and response prediction are arduous and time-consuming processes. This work investigates the use of generative models to predict anti-cancer drug response and facilitate the discovery of new drugs, using the chemical structures of drugs, gene expression data, and anti-cancer drug response data. An autoencoder is a type of neural network trained to learn a lower-dimensional representation of high-dimensional data, while a Variational Autoencoder (VAE) is trained to model the data as a distribution over the latent space. The proposed approach models the anti-cancer drug molecular data using a rectified junction tree VAE (JTVAE) model, while the cancer cell lines’ gene expression data is encoded using a VAE. A feed-forward artificial neural network takes the concatenated latent vectors of a cancer cell line's gene expression profile and of a drug and generates a final prediction, expressed as the ln(IC₅₀) score. The model was trained on three different datasets: one consisting of breast cancer cell lines, one of central nervous system cell lines, and one of pan-cancer cell lines. The pan-cancer model performed best; testing on pan-cancer cell lines produced a high average coefficient of determination (R² = 0.875). Additionally, this work investigated optimizing the latent space using weighted retraining, a sample-efficient technique that weighted data points according to their blood-brain barrier permeability. A Message Passing Neural Network (MPNN) based predictor was trained to predict the blood-brain barrier permeability score of the input data.
After training the generative model (JTVAE), the objective function was optimized over the learned latent space using a surrogate model. This decoupled the task of training the generative model from that of optimizing over it. Weighting the data distribution and periodically retraining the generative model improved the predicted blood-brain barrier permeability score, which is an important factor for drugs that target central nervous system tumours.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Variational Auto-Encoder
dc.subject: Cancer Drug Response
dc.subject: Generative Models
dc.subject: Anti-Cancer Drug Discovery
dc.subject: Latent Space Optimization
dc.title: Development of an Integrated Machine Learning Approach for Anti-Cancer Drug Discovery Using Variational Auto-Encoder
dc.type: Thesis
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Electrical Engineering
thesis.degree.grantor: Texas A&M University
thesis.degree.name: Master of Science
thesis.degree.level: Masters
dc.contributor.committeeMember: Park, Hangue
dc.contributor.committeeMember: Zou, Jun
dc.contributor.committeeMember: Tian, Limei
dc.type.material: text
dc.date.updated: 2022-07-27T16:43:43Z
local.embargo.terms: 2023-12-01
local.etdauthor.orcid: 0000-0002-1614-4346

