Masud S., Mahajan K., Kondyli A., Deliali K., Yannis G., “Leveraging Machine Learning Algorithms to Predict and Analyze Single-Vehicle and Multi-Vehicle Crash Occurrences on Motorways”, Proceedings of the Transportation Research Board (TRB) 103rd Annual Meeting, Washington, 7-11 January 2024

Road crashes are a common occurrence in many parts of the world, causing significant loss of life, injury, and economic damage. Crashes can be broadly classified into single-vehicle crashes (SV) and multi-vehicle crashes (MV). Various statistical approaches have been implemented to identify the key factors behind these two types of crashes and it has been concluded that these factors need to be analyzed separately. The dataset for this research included various types of roadway design parameters and traffic conditions. Combinations of three feature-selection techniques such as ANOVA, correlation matrix, and ExtraTreesClassifier algorithm were utilized to separately select the appropriate variables for SV and MV crash analysis. Various Machine Learning (ML) models (e.g., LightGBM, XGBoost, etc.) along with a statistical method (binary logistic regression) have been adopted to predict SV and MV crash occurrences. The results show that gradient-boosting type ML algorithms outperform the remaining prediction models and the LightGBM was found to be the most powerful in prediction. The LightGBM classifier produced accuracy, ROC_AUC, and avg. F-1 score of 0.75, 0.83, and 0.76 respectively for MV crashes and 0.76, 0.82, and 0.76 respectively for SV crashes. The SHapley Additive exPlanations (SHAP) analysis was used to explain how each variable impacted the models’ output. The results confirmed that the crash factors associated with SV and MV crashes are different and that some variables have inverse impact. Artificial intelligence and ML can assist transportation professionals in better understanding the causes of SV and MV crashes and advance the process toward Vision Zero