
The aim of this Diploma Thesis is to develop classification models that identify risky driving behavior and assign it to one of three safety levels. To this end, data on driver behavior were collected in a driving experiment conducted under real-world conditions in Belgium and the United Kingdom. In the initial analysis, variable importance was calculated with the Random Forest algorithm, and nine input variables were selected on that basis for further analysis. The SMOTE oversampling method was then applied to address the imbalance in the data. After these two steps, four classification algorithms for driving behavior were developed; a confusion matrix was calculated for each, and their evaluation metrics were compared. Subsequently, SHAP values were examined to better understand the influence of each selected input variable. This made it possible to calculate the average SHAP importance, leading to the selection of the CatBoost and LightGBM models as the most effective. Finally, the importance of the variables in the two selected models was visualized with the SHAP method for each of the three safety levels, and the influence of each variable on changes in safety level was analyzed separately. The average speed of the vehicle was identified as the most significant variable, while sudden driving events, both harsh acceleration and harsh braking, were found to significantly influence the classification of driving behavior as dangerous.
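
The oversampling step mentioned above can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The abstract does not say which implementation was used, so the function name `smote_oversample`, its parameters, and the toy feature values below are assumptions made purely for illustration, not the thesis code or data.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority-class feature vectors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Sort the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [average speed, harsh events per trip] for "dangerous" trips.
dangerous_trips = [[92.0, 4.0], [88.0, 5.0], [95.0, 3.0], [90.0, 6.0]]
new_trips = smote_oversample(dangerous_trips, n_new=6, k=2)
```

Because every synthetic point is an interpolation between two real minority samples, it stays within the range of the observed minority data. In practice a maintained library implementation would normally be preferred over a hand-written version like this one.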
ID | ad169 |