Institutional-Repository, University of Moratuwa.  

Building explanatory models for road crash analysis using data science and machine learning technologies

dc.contributor.advisor Perera L
dc.contributor.author De Silva HWIU
dc.date.accessioned 2022
dc.date.available 2022
dc.date.issued 2022
dc.identifier.citation De Silva, H.W.I.U. (2022). Building explanatory models for road crash analysis using data science and machine learning technologies [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/19697
dc.identifier.uri http://dl.lib.uom.lk/handle/123/19697
dc.description.abstract Over three thousand people die annually on the roads of Sri Lanka due to traffic crashes. This is a massive social and economic problem faced by the country. Road crashes globally cause more than 1.3 million fatalities every year and are the eighth leading cause of death worldwide. Traditionally, road traffic crash analysis and accident modeling have relied on regression models and discrete choice models based on past data. Many countermeasures have been identified and implemented to address the issues highlighted through such models. Since road traffic crashes occur across space and time, the conventional numerical approaches have failed to provide alerts and insights in relation to geospatial regions. Also, having to handcraft these models limits the explainability that can be leveraged with the help of advanced tools and techniques available in modern data science and machine learning disciplines. Further, the disjointed efforts in building analytical models or geospatial models on available crash data (e.g., crash hotspot identification) limit road agencies' ability to prioritize the allocation of funds for more impactful improvements. Due to the difficulty in identifying patterns in causal factors of accident risks using conventional or isolated methods, the authorities also find it difficult to prioritize the deployment of their staff in high-risk areas. The combination of exploratory data analysis (EDA), machine learning models, and modern geospatial visualization tools offers a unique opportunity to fill these gaps cost-effectively. This study presents an application of the latest data science and machine learning technologies to build explanatory models that help analyze road crashes. Popular packages written in the Python and JavaScript programming languages were used. The Pandas and SweetViz libraries provided simple yet powerful EDA.
The GeoPandas library provided the ability to process GPS locations (latitude and longitude), while Matplotlib was used to generate static maps. The Folium library and the underlying Leaflet.js library were applied to generate interactive maps to help visualize crash hot spots. Two leading gradient boosting techniques, namely LightGBM and CatBoost, were applied to build models that highlight causal factors via feature importance estimation methods. The study developed algorithms, methods, and charts to generate attribute correlation and gradient-boosted decision tree models to relate accident severity with recorded data sets and interactions of certain aggregate features (e.g., weather and light conditions). The visualization efforts produced road crash density maps by administrative region size and population. Interactive maps that allow authorities to drill down (or zoom in) to hot spots were also developed. The programmatic approach developed in this study enables the repeatable application of the explanatory analysis and visualizations to new and old datasets with minimal effort. The findings from the study lay the foundation for a digital system that can be easily converted to an online platform for road and enforcement agencies to obtain reports and alerts on road crash risks and hot spots. The application was tested using crash data in Sri Lanka and the outcomes are presented in this study. Future work on the fusion of multiple data sources, such as real-time weather data and traffic congestion levels, onto the same platform can extend these outcomes toward near real-time crash prediction, further assisting proactive accident prevention measures. en_US
dc.language.iso en en_US
dc.subject ROAD SAFETY en_US
dc.subject EXPLANATORY MODELS en_US
dc.subject GEOSPATIAL CRASH VISUALIZATION en_US
dc.subject MULTI-FACETED ANALYSIS en_US
dc.subject ROAD CRASHES en_US
dc.subject EXPLORATORY DATA ANALYSIS en_US
dc.subject MACHINE LEARNING CRASH MODELS en_US
dc.subject TRANSPORTATION - Dissertation en_US
dc.subject CIVIL ENGINEERING - Dissertation en_US
dc.title Building explanatory models for road crash analysis using data science and machine learning technologies en_US
dc.type Thesis-Abstract en_US
dc.identifier.faculty Engineering en_US
dc.identifier.degree M.Sc. in Transportation en_US
dc.identifier.department Department of Civil Engineering en_US
dc.date.accept 2022
dc.identifier.accno TH4919 en_US
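
The abstract mentions road crash density maps normalized by administrative region size and population. As a rough illustration of that normalization only (not the thesis's actual code), the following minimal Python sketch computes crashes per square kilometre and per 100,000 residents. The `Region` class, the `crash_densities` function, and all figures are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical record for one administrative region."""
    name: str
    crashes: int       # recorded crashes in the study period
    area_km2: float    # land area in square kilometres
    population: int    # resident population


def crash_densities(regions):
    """Map each region name to (crashes per km^2, crashes per 100,000 residents)."""
    result = {}
    for r in regions:
        per_km2 = r.crashes / r.area_km2
        per_100k = r.crashes * 100_000 / r.population
        result[r.name] = (per_km2, per_100k)
    return result


# Made-up figures, not data from the thesis.
regions = [
    Region("Region A", crashes=120, area_km2=60.0, population=400_000),
    Region("Region B", crashes=45, area_km2=300.0, population=90_000),
]
densities = crash_densities(regions)
```

In this made-up example, Region A has the higher density by area while Region B has the higher rate by population, which is one reason a map might present both normalizations side by side.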

