An Optimized supervised learning model for predicting survival rate

Date

2025

Abstract

The incorporation of machine-learning techniques into medical informatics has enabled major improvements in cancer survivability prognosis. This study focuses on improving the accuracy of breast cancer survivability prediction and on developing a prediction model using the METABRIC dataset, which contains both clinical and genomic features. A further objective was to improve the model through feature engineering and missing-data handling. The study began by evaluating several ML algorithms (Logistic Regression, Support Vector Classification, Random Forest, Categorical Boosting, and Extreme Gradient Boosting) in order to select the best algorithm for this dataset and then improve its prediction accuracy through algorithm customization. Of these, Extreme Gradient Boosting provided the best baseline accuracy (81.10%). The prediction pipeline was then improved with kernel-based multiple imputation of missing values and with advanced feature engineering techniques such as log transformation, interaction features, and categorical feature encoding, which raised prediction accuracy to 87.5%. To improve accuracy further, the Extreme Gradient Boosting algorithm was customized with a new composite objective function combining Asymmetric Cost Sensitivity and Smoothed Focal Loss, which yielded a further accuracy improvement of 1.5%. The final proposed pipeline, built on the customized Extreme Gradient Boosting algorithm, offers a highly accurate and clinically aligned survivability prediction model that can serve as a base for disease prognosis using transfer learning.
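The abstract does not give the exact form of the composite objective, so the following is only an illustrative sketch of how a cost-sensitive focal loss might be expressed as the per-sample gradient and Hessian that gradient boosting needs. Everything here is an assumption rather than the thesis's formulation: the function names, the parameters `gamma`, `fn_cost`, `fp_cost`, and `smoothing`, their default values, and in particular the reading of "smoothed" as label smoothing.

```python
import numpy as np


def _sigmoid(margin):
    # Clip so that log() and the power term stay finite for extreme margins.
    return np.clip(1.0 / (1.0 + np.exp(-margin)), 1e-7, 1.0 - 1e-7)


def _focal_terms(margin, gamma):
    """First and second derivative of -(1 - q)^gamma * log(q) w.r.t. margin,
    where q = sigmoid(margin) is the probability of the target class."""
    q = _sigmoid(margin)
    log_q = np.log(q)
    mod = (1.0 - q) ** gamma                      # focal modulating factor
    inner = gamma * q * log_q + q - 1.0
    grad = mod * inner
    hess = q * mod * ((1.0 - q) * (gamma * log_q + gamma + 1.0) - gamma * inner)
    return grad, hess


def _soft_labels(labels, smoothing):
    # Assumed meaning of "smoothed": move hard 0/1 labels slightly toward 0.5.
    y = np.asarray(labels, dtype=float)
    return y * (1.0 - smoothing) + 0.5 * smoothing


def asymmetric_focal_loss(preds, labels, gamma=2.0, fn_cost=2.0,
                          fp_cost=1.0, smoothing=0.05):
    """Per-sample loss on raw margins (hypothetical parameter defaults)."""
    z = np.asarray(preds, dtype=float)
    y = _soft_labels(labels, smoothing)
    pos = -((1.0 - _sigmoid(z)) ** gamma) * np.log(_sigmoid(z))
    neg = -((1.0 - _sigmoid(-z)) ** gamma) * np.log(_sigmoid(-z))
    # fn_cost weights errors on the positive class, fp_cost on the negative.
    return fn_cost * y * pos + fp_cost * (1.0 - y) * neg


def asymmetric_focal_objective(preds, labels, gamma=2.0, fn_cost=2.0,
                               fp_cost=1.0, smoothing=0.05):
    """Gradient and Hessian of asymmetric_focal_loss w.r.t. the raw margins."""
    z = np.asarray(preds, dtype=float)
    y = _soft_labels(labels, smoothing)
    g_pos, h_pos = _focal_terms(z, gamma)
    g_neg, h_neg = _focal_terms(-z, gamma)        # negative class sees -z
    grad = fn_cost * y * g_pos - fp_cost * (1.0 - y) * g_neg
    hess = fn_cost * y * h_pos + fp_cost * (1.0 - y) * h_neg
    # Focal loss is not convex in the margin; floor the Hessian so that
    # second-order boosting updates stay well defined.
    return grad, np.maximum(hess, 1e-6)
```

With XGBoost's native training API, a function like this could be passed as the `obj` argument of `xgboost.train` through a thin wrapper such as `lambda preds, dtrain: asymmetric_focal_objective(preds, dtrain.get_label())`, since custom objectives there receive raw margins and return a gradient and Hessian per sample. Setting `fn_cost` above `fp_cost` penalizes missed positives more heavily. Which class and what cost ratio are clinically appropriate is a modelling decision the abstract does not specify.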

Citation

Saumya, T.M.D. (2025). An Optimized supervised learning model for predicting survival rate [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/24853
