Improved Guidelines for Recalibration of Predictive Models over Time Based on Model Uncertainty
Abstract
The Highway Safety Manual (HSM) summarizes the safety performance functions (SPFs) of various facility types. The primary use of SPFs is to estimate the safety performance (i.e., the number of crashes by severity level) of different facilities based on geometric and traffic variables. The SPFs were developed using the negative binomial (NB) regression model based on crash data obtained from a selected number of states and cities in the United States and Canada. When applied directly to local jurisdictions, the SPFs may yield biased or incorrect results. Therefore, calibration of the SPFs or predictive models is an important step before applying them to local jurisdictions. Moreover, it is also necessary to recalibrate SPFs over time to account for variations in factors that cannot be accounted for directly in SPFs, such as changes in driver behavior, crash-reporting thresholds, etc. The calibration factor (for a specific facility type) is defined as the ratio of the observed number of crashes to the predicted number of crashes. The HSM recommends that SPFs be recalibrated every 2 to 3 years. However, this recommendation is not based on sound research or reliable criteria. The lack of appropriate guidelines can lead to two types of errors: recalibrating the models when it is not needed, and not recalibrating them when such a need arises.
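As a minimal illustration of the calibration factor defined above, the following Python sketch computes the ratio of total observed crashes to total SPF-predicted crashes across the sites of one facility type. The function name and the sample numbers are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def calibration_factor(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Ratio of total observed crashes to total SPF-predicted crashes.

    Both sequences hold one value per site of the same facility type
    and cover the same time period.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    total_predicted = sum(predicted)
    if total_predicted <= 0:
        raise ValueError("total predicted crashes must be positive")
    return sum(observed) / total_predicted


# Hypothetical data for four sites (illustrative only).
observed_crashes = [3, 0, 5, 2]
predicted_crashes = [2.0, 1.0, 4.0, 1.0]
c_factor = calibration_factor(observed_crashes, predicted_crashes)
print(f"C = {c_factor:.2f}")
```

A value of C above 1 would suggest that the uncalibrated SPF underpredicts crashes in the local jurisdiction, and a value below 1 that it overpredicts.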
The aim of this thesis is to develop guidelines regarding when or how often SPFs should be recalibrated. To this end, two methodologies were created related to the variance or uncertainty associated with the SPFs, and the guidelines were developed using statistical principles. These guidelines are that SPFs should be recalibrated when (i) the total number of crashes that occur in a network of similar types of facilities falls outside the prediction interval of the predicted or estimated total number of crashes in that same network; or (ii) the calibration factor developed in a specific year is statistically significantly different from 1 (based on the coefficient of variation (CV) of the SPF and the calibration factor C).
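The idea behind guideline (i) can be sketched in Python as follows. This is not the procedure used in the thesis, whose details are not given in this abstract. It is a simplified sketch under stated assumptions: each site follows an NB model with variance mu + alpha * mu^2, sites are independent, the uncertainty in the estimated SPF coefficients is ignored, and the network total is approximated as normal. The function names, the dispersion value, and the sample data are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def total_prediction_interval(
    predicted: Sequence[float], dispersion: float, level: float = 0.95
) -> Tuple[float, float]:
    """Approximate prediction interval for the network-wide crash total.

    Assumes independent sites with NB variance mu + dispersion * mu**2
    and a normal approximation for the sum (assumptions of this sketch).
    """
    total = sum(predicted)
    variance = sum(mu + dispersion * mu ** 2 for mu in predicted)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(variance)
    # Crash counts cannot be negative, so the lower bound is floored at 0.
    return max(0.0, total - half_width), total + half_width


def needs_recalibration(
    observed_total: float,
    predicted: Sequence[float],
    dispersion: float,
    level: float = 0.95,
) -> bool:
    """Flag recalibration when the observed total falls outside the interval."""
    lower, upper = total_prediction_interval(predicted, dispersion, level)
    return not (lower <= observed_total <= upper)


# Hypothetical network of four sites with an assumed dispersion parameter.
predicted_crashes = [2.0, 1.0, 4.0, 1.0]
lower, upper = total_prediction_interval(predicted_crashes, dispersion=0.5)
print(f"95% interval for the total: [{lower:.1f}, {upper:.1f}]")
print(needs_recalibration(10, predicted_crashes, dispersion=0.5))
```

Ignoring coefficient uncertainty makes the interval narrower than one that accounts for it, so this sketch would tend to flag recalibration more often than a fuller treatment of SPF variance.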
Both approaches were tested on several intersection and segment datasets from Michigan and Toronto. The results show that both approaches are feasible and could provide safety analysts with better and more reliable guidelines regarding when SPFs should be recalibrated. However, the methodologies developed in this thesis cannot be applied to the SPFs in the HSM, since the information needed to evaluate the variance of those SPFs is not available in the manual. The results of both methodologies were compared to the results of a methodology recently proposed in the literature that can be applied to HSM SPFs and uses a fixed threshold value of the C-factor error estimate (e.g., 10%). This comparison indicated that the 10% error is a reasonable value to use for recalibrating models.
The shortcomings of these methodologies include the need to develop a new SPF (which is a time-consuming and work-intensive process) and to collect extensive data every year. When data are available every year, the practitioner might as well estimate a new calibration factor every year instead of needing to know the frequency of recalibration or using an approximate method (Cproxy). Future research in this area should focus on identifying the minimum data requirements for both methodologies proposed in this thesis.
Citation
Bommanayakanahalli, Bharadwaj (2018). Improved Guidelines for Recalibration of Predictive Models over Time Based on Model Uncertainty. Master's thesis, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/173402.