Early detection of Sinhala language fake news in social media networks

Hathnapitiya, H.G.H.S

dc.contributor.advisor	Ahangama S
dc.contributor.advisor	Adikari S
dc.contributor.author	Hathnapitiya, H.G.H.S
dc.date.accessioned	2024-10-10T08:25:35Z
dc.date.available	2024-10-10T08:25:35Z
dc.date.issued	2024
dc.identifier.citation	Hathnapitiya, H.G.H.S. (2024). Early detection of Sinhala language fake news in social media networks [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/22899
dc.identifier.uri	http://dl.lib.uom.lk/handle/123/22899
dc.description.abstract	People have long adopted new technologies to make gathering information easier. For much of the twentieth century, they read newspapers, listened to radio, and watched television. As technology matured, social media platforms were introduced to connect people, and many now browse and rely on these platforms for news while losing interest in traditional media. Social media is easy to access and cost-effective, but the same properties make it an effortless channel for propagating fake news and misleading people for personal, political, or religious benefit. Society therefore needs a proper mechanism to limit the spread of false information. Human experts can investigate news content manually, but this requires many experts and consumes time. This study introduces an automated system that detects Sinhala fake news on social media at the time the content is published. The dataset was created by gathering news from Facebook that had been either proven fake by Sri Lankan fact-checkers or reported as legitimate by Sri Lankan news broadcasting channels. The proposed method uses content-related features with deep learning and machine learning techniques. The deep learning model extracts Sinhala POS tags and combines their TF-IDF values with XLM-R embeddings; it achieved 86% accuracy. The machine learning approach uses TF-IDF values of Sinhala POS tags, FastText embeddings, and punctuation counts; it achieved 85% accuracy. The proposed methods can identify fake news early, which helps prevent its spread. Performance could be further improved by collecting more data to enlarge the dataset. Keywords – Sinhala fake news, social media, content-related features, natural language processing (NLP), deep learning (DL), machine learning (ML)	en_US
dc.language.iso	en	en_US
dc.subject	SOCIAL MEDIA
dc.subject	SINHALA FAKE NEWS
dc.subject	NATURAL LANGUAGE PROCESSING (NLP)
dc.subject	CONTENT-RELATED FEATURES
dc.subject	MACHINE LEARNING (ML)
dc.subject	DEEP LEARNING (DL)
dc.subject	INFORMATION TECHNOLOGY COMPUTER SCIENCE- Dissertation
dc.subject	MSc (Major Component Research)
dc.title	Early detection of Sinhala language fake news in social media networks	en_US
dc.type	Thesis-Abstract	en_US
dc.identifier.faculty	IT	en_US
dc.identifier.degree	MSc in Information Technology By research	en_US
dc.identifier.department	Department of Information Technology	en_US
dc.date.accept	2024
dc.identifier.accno	TH5543	en_US
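
The abstract names TF-IDF values of Sinhala POS tags and a punctuation count among the content-related features. The sketch below shows one way such a feature vector might be assembled for a handful of posts. It is an illustration only, not the thesis implementation: the POS tag names (`NN`, `VB`, and so on), the example posts, the helper names, the smoothed IDF variant, and the use of ASCII `string.punctuation` are all assumptions. The embedding features (XLM-R, FastText) and the classifier are omitted because the record does not describe how they were configured.

```python
import math
import string
from collections import Counter


def tfidf_vectors(tag_sequences):
    """Return (vocabulary, vectors) of TF-IDF weights over POS tags.

    Uses length-normalised term frequency and a smoothed IDF,
    ln((1 + N) / (1 + df)) + 1. This variant is an assumption.
    """
    vocab = sorted({tag for seq in tag_sequences for tag in seq})
    n_docs = len(tag_sequences)
    # Document frequency: number of posts in which each tag occurs.
    df = Counter(tag for seq in tag_sequences for tag in set(seq))
    idf = {tag: math.log((1 + n_docs) / (1 + df[tag])) + 1.0 for tag in vocab}
    vectors = []
    for seq in tag_sequences:
        counts = Counter(seq)
        total = len(seq) or 1  # avoid division by zero for empty posts
        vectors.append([counts[tag] / total * idf[tag] for tag in vocab])
    return vocab, vectors


def punctuation_count(text):
    """Count ASCII punctuation characters (assumption: ASCII marks only)."""
    return sum(1 for ch in text if ch in string.punctuation)


def build_features(posts):
    """posts: list of (text, pos_tags) pairs; tags come from an external tagger."""
    vocab, vectors = tfidf_vectors([tags for _, tags in posts])
    features = [
        vec + [float(punctuation_count(text))]
        for (text, _), vec in zip(posts, vectors)
    ]
    return vocab + ["PUNCT_COUNT"], features


# Hypothetical posts with placeholder text and made-up POS tags.
posts = [
    ("Breaking!!! Share this now!!!", ["JJ", "VB", "DT", "RB"]),
    ("The ministry issued a statement today.", ["DT", "NN", "VB", "DT", "NN", "NN"]),
]
names, features = build_features(posts)
for row in features:
    print(dict(zip(names, (round(v, 3) for v in row))))
```

In a full pipeline, each row would be concatenated with a sentence embedding and passed to a classifier; those steps are left out here because their details are not given in the record.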