dc.contributor.advisor: Zhang, Yunlong
dc.contributor.advisor: Wang, Xiubin B
dc.creator: Wei, Zihang
dc.date.accessioned: 2022-01-27T22:16:00Z
dc.date.available: 2023-08-01T06:41:54Z
dc.date.created: 2021-08
dc.date.issued: 2021-07-15
dc.date.submitted: August 2021
dc.identifier.uri: https://hdl.handle.net/1969.1/195342
dc.description.abstract: Conventional traffic crash analysis methods often use highly aggregated data, making it difficult to understand the effects of many time-varying factors on crash occurrence. Although some studies have used data with small aggregation intervals, they typically analyze the effect of a single factor on crash occurrence. In this study, the combined effect of roadway geometry, speed distribution, and weather conditions on crash occurrence and severity is investigated by applying an interpretable machine learning method, XGBoost (eXtreme Gradient Boosting), to daily-level crash data. The data are collected from four different sources on roadways in Texas. Three roadway facility types are considered: (1) Rural Interstate; (2) Rural Two-Lane; (3) Rural Multilane. In the feature selection process, the Pearson correlation coefficient is used to remove highly correlated variables. The study then uses the synthetic minority over-sampling technique (SMOTE) to mitigate the data imbalance issue. The XGBoost model is trained twice: first on data with all crash severity levels, and then only on data with fatal and severe injury crashes. Finally, the SHAP (SHapley Additive exPlanation) method is applied to investigate the contribution of each variable to the model’s output. The results show that the contributions of the variables tend to differ across roadway facility types, and that the variables also contribute differently to crashes of different severity levels. [en]
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Crash Occurrence [en]
dc.subject: Machine Learning [en]
dc.subject: XGBoost [en]
dc.subject: SMOTE [en]
dc.subject: SHAP [en]
dc.subject: Data Science [en]
dc.title: INVESTIGATING THE IMPACT OF ROADWAY GEOMETRY, SPEED DISTRIBUTION, AND WEATHER CONDITION ON ROADWAY DAILY CRASH OCCURRENCE AND SEVERITY BY USING MACHINE LEARNING METHODS [en]
dc.type: Thesis [en]
thesis.degree.department: Civil and Environmental Engineering [en]
thesis.degree.discipline: Civil Engineering [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Master of Science [en]
thesis.degree.level: Masters [en]
dc.contributor.committeeMember: Jones, David E
dc.contributor.committeeMember: Das, Subasish
dc.type.material: text [en]
dc.date.updated: 2022-01-27T22:16:01Z
local.embargo.terms: 2023-08-01
local.etdauthor.orcid: 0000-0002-1790-022X
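
The abstract mentions using the Pearson correlation coefficient to remove highly correlated variables during feature selection. A minimal sketch of what such a filter could look like follows. It is not taken from the thesis: the 0.9 threshold, the greedy keep-first rule, and the column names (mean_speed, p85_speed, precip) are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treated as 0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep a column unless |r| with an already kept column exceeds threshold.

    The threshold value is an assumption, not a figure from the thesis.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical daily features; p85_speed is an exact multiple of mean_speed.
columns = {
    "mean_speed": [60.0, 62.0, 58.0, 65.0, 61.0],
    "p85_speed": [66.0, 68.2, 63.8, 71.5, 67.1],
    "precip": [0.0, 1.2, 0.0, 0.3, 2.5],
}
selected = drop_correlated(columns)
print(selected)
```

In this toy input, p85_speed is dropped because it is almost perfectly correlated with mean_speed, while precip is only weakly correlated with it and is kept. Which variable of a correlated pair to keep is a design choice; this sketch simply keeps whichever appears first.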

