
dc.contributor.advisor: Walker, Duncan M
dc.creator: Bang, Sung Je
dc.date.accessioned: 2021-01-29T15:01:29Z
dc.date.available: 2021-01-29T15:01:29Z
dc.date.created: 2020-08
dc.date.issued: 2020-07-30
dc.date.submitted: August 2020
dc.identifier.uri: https://hdl.handle.net/1969.1/192181
dc.description.abstract: In this research, we developed a new deep learning method for detecting and removing outliers from point-based data sets. Our goal was an outlier detection method that ties the detection procedure to the model-building process. Exploiting the different behaviors of outliers and inliers, we use model complexity as an indicator of outliers in data sets. In this context, the “complexity” of a model means the weight of non-zero edges in the model. This includes features of a model such as the number of layers and the number of nodes per layer. The proposed method consists of three steps. First, a low-complexity model (few layers or few nodes per layer) is built and trained on the data set, and its predicted value for each instance is recorded. Second, multiple neural network models with differing numbers of layers or nodes per layer are built, and the group of models with a specific number of layers that has the best average performance on the data set is identified. Performance in this context includes classification accuracy or mean squared error. Third, within that group, the model with the highest number of nodes per layer is selected, and its prediction for each instance is compared with the predicted value of the low-complexity model from the first step. Instances for which the two models' predictions differ are labeled as outliers and removed. Two limitations should be noted when using this method. First, the lower the correlation between the attributes and the output values in a data set, the fewer outliers the method will detect. Second, the larger and more complex a data set becomes (for example, when it has many attributes), the fewer outliers the method will find. (en)
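The selection and comparison logic in the second and third steps of the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis author's implementation: the function names `pick_complex_model` and `flag_outliers`, the `(n_layers, n_nodes)` keys, and all numbers are hypothetical. It also assumes the models have already been trained elsewhere, that a higher score is better (as with accuracy), and that predictions are discrete class labels.

```python
from collections import defaultdict
from statistics import mean


def pick_complex_model(scores):
    """Pick the high-complexity model (hypothetical helper).

    `scores` maps (n_layers, n_nodes) -> performance, where higher is
    assumed to be better (e.g. classification accuracy).
    """
    by_layers = defaultdict(list)
    for (n_layers, n_nodes), score in scores.items():
        by_layers[n_layers].append((n_nodes, score))
    # Step 2: find the layer count whose models perform best on average.
    best_layers = max(by_layers, key=lambda k: mean(s for _, s in by_layers[k]))
    # Step 3 (first half): within that group, take the most nodes per layer.
    widest_nodes = max(n for n, _ in by_layers[best_layers])
    return best_layers, widest_nodes


def flag_outliers(simple_preds, complex_preds):
    """Step 3 (second half): indices where the two models disagree."""
    if len(simple_preds) != len(complex_preds):
        raise ValueError("prediction lists must have the same length")
    return [i for i, (a, b) in enumerate(zip(simple_preds, complex_preds)) if a != b]


# Made-up scores and predictions, purely for illustration.
scores = {
    (1, 4): 0.80, (1, 16): 0.84,
    (2, 4): 0.88, (2, 16): 0.90,
    (3, 4): 0.85, (3, 16): 0.87,
}
chosen = pick_complex_model(scores)

simple_preds = [0, 1, 1, 0, 1]    # from the low-complexity model (step 1)
complex_preds = [0, 1, 0, 0, 1]   # from the model selected above
outliers = flag_outliers(simple_preds, complex_preds)

print(chosen)    # (2, 16)
print(outliers)  # [2]
```

If performance is measured by mean squared error instead, the scores could be negated so that higher still means better. For continuous predictions, exact inequality would flag nearly every instance, so some tolerance would be needed; the abstract does not say how that comparison is made.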
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: outlier (en)
dc.subject: anomaly (en)
dc.subject: detection (en)
dc.subject: deep (en)
dc.subject: learning (en)
dc.subject: neural (en)
dc.subject: network (en)
dc.subject: model (en)
dc.subject: models (en)
dc.subject: complexity (en)
dc.subject: layers (en)
dc.subject: nodes (en)
dc.title: OUTLIER DETECTION BY MODEL COMPLEXITY A NEW DEEP LEARNING METHOD (en)
dc.type: Thesis (en)
thesis.degree.department: Computer Science and Engineering (en)
thesis.degree.discipline: Computer Engineering (en)
thesis.degree.grantor: Texas A&M University (en)
thesis.degree.name: Master of Science (en)
thesis.degree.level: Masters (en)
dc.contributor.committeeMember: Chaspari, Theodora
dc.contributor.committeeMember: Kameoka, Jun
dc.type.material: text (en)
dc.date.updated: 2021-01-29T15:01:30Z
local.etdauthor.orcid: 0000-0003-4252-0381

