Classification of cyberbullying Sinhala language comments on social media

dc.contributor.authorAmali, HMAI
dc.contributor.authorJayalal, S
dc.contributor.editorWeeraddana, C
dc.contributor.editorEdussooriya, CUS
dc.contributor.editorAbeysooriya, RP
dc.date.accessioned2022-08-09T09:32:36Z
dc.date.available2022-08-09T09:32:36Z
dc.date.issued2020-07
dc.description.abstractDue to the technological revolution of recent years, bullying, which was once confined to physical boundaries, has now moved online. Denigration, or insult, is one form of cyberbullying. According to the Sri Lanka Computer Emergency Readiness Team, cyberbullying incidents on social media are escalating. Insulting words are dynamic, and the same word can have several meanings depending on the context. A comment cannot be classified as bullying simply because it contains such a word. Hence, simple keyword-spotting techniques are inadequate for labeling comments. Work on other languages has addressed this issue using lexical databases such as WordNet, which provides synonyms and homonyms of words. Since no proper lexical database has been developed for the Sinhala language, detecting a word as bullying is a challenge. Therefore, we used rules to overcome this issue. Twitter comments containing profane words were collected, outliers were removed, and the remaining tweets were pre-processed. To determine insult in the text, five rules were used for feature extraction. Afterward, we applied Support Vector Machine (SVM), K-nearest neighbor (KNN) and Naïve Bayes algorithms. The results show that SVM with an RBF kernel performs better, with an F1-score of 91%. The novelty of this research is its focus on Sinhala-language cyberbullying detection, which has not been addressed before.en_US
dc.identifier.citationH. M. A. Ishara Amali and S. Jayalal, "Classification of Cyberbullying Sinhala Language Comments on Social Media," 2020 Moratuwa Engineering Research Conference (MERCon), 2020, pp. 266-271, doi: 10.1109/MERCon50084.2020.9185209.en_US
dc.identifier.conferenceMoratuwa Engineering Research Conference 2020en_US
dc.identifier.departmentEngineering Research Unit, University of Moratuwaen_US
dc.identifier.doi10.1109/MERCon50084.2020.9185209en_US
dc.identifier.emailamalihma_im14002@stu.kln.ac.lken_US
dc.identifier.emailshantha@kln.ac.lken_US
dc.identifier.facultyEngineeringen_US
dc.identifier.pgnospp. 266-271en_US
dc.identifier.placeMoratuwa, Sri Lankaen_US
dc.identifier.proceedingProceedings of Moratuwa Engineering Research Conference 2020en_US
dc.identifier.urihttp://dl.lib.uom.lk/handle/123/18582
dc.language.isoenen_US
dc.publisherIEEEen_US
dc.relation.urihttps://ieeexplore.ieee.org/document/9185209en_US
dc.subjectcyberbullyingen_US
dc.subjectsocial mediaen_US
dc.subjecttext miningen_US
dc.subjectsentiment analysisen_US
dc.subjectmachine learningen_US
dc.titleClassification of cyberbullying Sinhala language comments on social mediaen_US
dc.typeConference-Full-texten_US
