Application of Finite Mixture Models for Vehicle Crash Data Analysis
Developing sound and reliable statistical models for analyzing vehicle crashes is very important in highway safety studies. A difficulty arises when crash data exhibit over-dispersion. Over-dispersion caused by unobserved heterogeneity is a serious problem and has been addressed in a variety of ways within the negative binomial (NB) modeling framework. However, the true factors that affect heterogeneity are often unknown to researchers, and failure to accommodate such heterogeneity in the model can undermine the validity of the empirical results. Given the limitations of the NB regression model for addressing over-dispersion of crash data due to heterogeneity, this research examined an alternative model formulation that captures heterogeneity through finite mixture regression models. A finite mixture of Poisson or NB regression models is especially useful when the count data are generated from a heterogeneous population. To evaluate these models, Poisson and NB mixture models were estimated using both simulated and empirical crash datasets, and the results were compared to those from a single NB regression model. For model parameter estimation, a Bayesian approach was adopted, since it provides much richer inference than the maximum likelihood approach. Using simulated datasets, it was shown that the single NB model is biased when the heterogeneity arises from multiple counting processes. The implications could be poor prediction performance and poor interpretation. Using two empirical datasets, the results demonstrated that a two-component finite mixture of NB regression models (FMNB-2) was sufficient to characterize the uncertainty about crash occurrence, and that it provided opportunities for interpreting the dataset that are not available from the standard NB model.
The relative performance of the models estimated from the empirical dataset (i.e., the FMNB-2 and NB models) was also examined in terms of hotspot identification and accident modification factors. Finally, using a simulation study, the bias properties of the posterior summary statistics for the dispersion parameters in the FMNB-2 model were characterized, and guidelines on the choice of priors and summary statistics were presented for different sample sizes and sample-mean values.
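As a rough illustration of why a heterogeneous population produces over-dispersed counts, the sketch below computes the probability mass function, mean, and variance of a two-component finite mixture of Poisson distributions. It is a minimal sketch only: the mixing weights and rates are made-up values, the function names are hypothetical, and it omits the regression covariates and the Bayesian estimation used in the dissertation.

```python
import math

def poisson_pmf(y, lam):
    """Probability of observing count y under a Poisson with rate lam."""
    return math.exp(-lam) * lam**y / math.factorial(y)

def mixture_pmf(y, weights, rates):
    """PMF of a finite Poisson mixture: sum over components of w_k * Poisson(y; lam_k)."""
    return sum(w * poisson_pmf(y, lam) for w, lam in zip(weights, rates))

def mixture_mean_var(weights, rates):
    """Mean and variance of the mixture from the component moments."""
    mean = sum(w * lam for w, lam in zip(weights, rates))
    # For a Poisson component, E[Y^2] = lam + lam^2.
    second_moment = sum(w * (lam + lam**2) for w, lam in zip(weights, rates))
    return mean, second_moment - mean**2

# Illustrative values only (assumed, not taken from the dissertation):
# 70% of sites follow a low-rate process, 30% a high-rate process.
weights = [0.7, 0.3]
rates = [1.0, 6.0]

mean, var = mixture_mean_var(weights, rates)
print(f"mixture mean = {mean:.3f}, variance = {var:.3f}")
print(f"variance / mean = {var / mean:.3f}")
```

Each component on its own has variance equal to its mean, yet the mixture's variance exceeds its mean because the component rates differ. This is the kind of heterogeneity that a single-distribution model absorbs into one dispersion term and that a finite mixture represents explicitly.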
Negative binomial regression model
Latent class model
Park, Byung Jung (2010). Application of Finite Mixture Models for Vehicle Crash Data Analysis. Doctoral dissertation, Texas A&M University. Available electronically from