An Analytical study of pre - trained models for sentiment analysis of sinhala news comments

Dissanayake MLS

dc.contributor.advisor	Thayasivam U
dc.contributor.author	Dissanayake MLS
dc.date.accept	2022
dc.date.accessioned	2022
dc.date.available	2022
dc.date.issued	2022
dc.description.abstract	In natural language processing, the availability of large-scale text data has made sentiment analysis a prevalent topic. Sentiment analysis is a text classification task that mainly focuses on classifying recommendations and reviews as positive or negative. Earlier methods for this task typically required collecting product reviews, labeling them, and then training a classifier on the labeled reviews. Because the distribution of reviews can differ between domains, a large amount of labeled data is needed for each product to train these classification models and improve their performance. However, labeling data is expensive and time consuming. For low-resource languages such as Sinhala, annotated data is limited compared with languages such as English, which makes applying classification algorithms to Sinhala sentiment classification challenging. Rather than relying only on traditional algorithms, this research uses pre-trained models (PTMs) and examines whether they outperform the traditional methods. PTMs play an important role in natural language processing because they can be applied to downstream tasks. This research therefore applies PTMs such as BERT and XLNet to classify sentiments. Experiments on the BERT model follow two approaches: fine-tuning the model and a feature-based approach. The research also uses the existing RoBERTa-based Sinhala models SinBERT-small and SinBERT-large, which are available on the official Hugging Face site and were trained on a large Sinhala language corpus.	en_US
dc.identifier.accno	TH5120	en_US
dc.identifier.citation	Dissanayake, M.L.S. (2022). An Analytical study of pre - trained models for sentiment analysis of sinhala news comments [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/22462
dc.identifier.degree	MSc in Computer Science & Engineering	en_US
dc.identifier.department	Department of Computer Science & Engineering	en_US
dc.identifier.faculty	Engineering	en_US
dc.identifier.uri	http://dl.lib.uom.lk/handle/123/22462
dc.language.iso	en	en_US
dc.subject	TRANSFER LEARNING	en_US
dc.subject	SENTIMENT ANALYSIS	en_US
dc.subject	PRE-TRAINED MODELS	en_US
dc.subject	SINHALA NEWS COMMENTS	en_US
dc.subject	COMPUTER SCIENCE- Dissertation	en_US
dc.subject	COMPUTER SCIENCE & ENGINEERING - Dissertation	en_US
dc.title	An Analytical study of pre - trained models for sentiment analysis of sinhala news comments	en_US
dc.type	Thesis-Abstract	en_US

Files

Original bundle

Name:: TH5120-1.pdf
Size:: 137.88 KB
Format:: Adobe Portable Document Format
Description:: Pre-Text

Name:: TH5120-2.pdf
Size:: 155.1 KB
Format:: Adobe Portable Document Format
Description:: Post-Text

Name:: TH5120.pdf
Size:: 1.42 MB
Format:: Adobe Portable Document Format
Description:: Full thesis

Collections

Master of Science in Computer science and Engineering