Understanding the Factors Affecting Safety of E-Scooter and Bicycle Users in Urban Environments: An Injury Severity Analysis Using Machine Learning and Natural Language Processing
Abstract
This study analyzes 153 e-scooter crashes, extracted from the Crash Records Information System (CRIS) using text mining, and 819 bicycle crashes from 2018 to 2021 obtained from the City of Austin. Natural language analysis is used to extract new features based on the words used in the crash reports. The synthetic minority oversampling technique (SMOTE), combined with the NearMiss algorithm, is used to address the class imbalance problem. Three models are used for injury severity classification: a logistic regression classifier and two tree-based machine learning methods. Their recall scores increased approximately tenfold when using the resampled dataset. The results of the SHAP analysis provide a greater understanding of the similarities and differences in the factors contributing to crash injury severity for e-scooter and bicycle riders. Overall, the findings suggest the need for targeted interventions to improve the safety of both e-scooter and bicycle riders, such as making urban environments more e-scooter-friendly, improving bike facility design, and expanding education and awareness campaigns that help users share the road safely.
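The resampling idea mentioned above (SMOTE oversampling of the minority class combined with NearMiss undersampling of the majority class) can be illustrated with a minimal, standard-library-only sketch. This is not the thesis's implementation: the abstract does not state which library, parameters, or order of operations were used. The toy data, the function names `smote` and `near_miss`, the target class size, and the choice of a NearMiss-1 style rule are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Simplified SMOTE: build n_new synthetic points, each interpolated
    between a random minority sample and one of its k nearest minority
    neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def near_miss(majority, minority, n_keep, k=3):
    """Simplified NearMiss-1: keep the n_keep majority samples whose mean
    distance to their k nearest minority samples is smallest."""
    def score(p):
        nearest = sorted(math.dist(p, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)
    return sorted(majority, key=score)[:n_keep]


# Hypothetical toy data: 40 majority-class points (e.g. less severe crashes)
# and 5 minority-class points (e.g. severe crashes), two features each.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
minority = [(rng.gauss(2, 0.5), rng.gauss(2, 0.5)) for _ in range(5)]

target = 15  # assumed class size after resampling
synthetic = smote(minority, target - len(minority))
minority_resampled = minority + synthetic
majority_resampled = near_miss(majority, minority, target)

print(len(majority_resampled), len(minority_resampled))  # prints: 15 15
```

In practice, resampling of this kind is normally applied to the training split only, so that recall is still measured on data with the original class distribution.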
Citation
Koirala, Pranik (2023). Understanding the Factors Affecting Safety of E-Scooter and Bicycle Users in Urban Environments: An Injury Severity Analysis Using Machine Learning and Natural Language Processing. Master's thesis, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/199149.