Identify hateful comments in Sinhala language on social media

dc.contributor.advisor Premaratne S.C
dc.contributor.author Fernando W.W.E.N.
dc.date.accept 2021
dc.date.accessioned 2021
dc.date.available 2021
dc.date.issued 2021
dc.description.abstract At present, the spread of hate speech through social media has become a very serious problem, both globally and locally. The root cause of this is the increasing use of social media with the rapid expansion of computer science and information technology. Therefore, it is very important to use the same technology to control this kind of situation. Although there is a mechanism in place on social media to automatically control such hate speech in the English language, no such mechanism is yet seen for the Sinhala language. The reason for this is the lack of knowledge about native languages such as Sinhala among social media service providers. Therefore, the identification of hateful content in the Sinhala language is an urgent and vital task that needs to be addressed. This research proposes lexicon-based and machine-learning-based approaches for the automatic identification of hateful speech in Sinhala on social media. With different pre-processing techniques and machine learning algorithms, the machine-learning-based approach was carried out in four different ways. These approaches began with 3000 comments, equally divided into hateful and non-hateful. Using these comments, it was possible to identify the most appropriate feature groups and model for identifying hateful speech in the Sinhala language on social media. en_US
dc.identifier.accno TH4553 en_US
dc.identifier.citation Fernando, W. W. E. N. (2021). Identify hateful comments in Sinhala language on social media [Masters Theses, University of Moratuwa]. University of Moratuwa Institutional Repository. http://dl.lib.uom.lk/handle/123/17553
dc.identifier.degree MSc in Information Technology en_US
dc.identifier.department Department of Information Technology en_US
dc.identifier.faculty IT en_US
dc.identifier.uri http://dl.lib.uom.lk/handle/123/17553
dc.language.iso en en_US
dc.subject SOCIAL MEDIA – Language Use – Sinhala en_US
dc.subject TEXT MINING en_US
dc.subject MULTINOMIAL NAÏVE BAYES en_US
dc.subject CONTENT IDENTIFICATION en_US
dc.subject INFORMATION TECHNOLOGY – Dissertation en_US
dc.subject COMPUTER SCIENCE – Dissertations en_US
dc.title Identify hateful comments in Sinhala language on social media en_US
dc.type Thesis-Abstract en_US
