An Analytical study of pre - trained models for sentiment analysis of sinhala news comments

dc.contributor.advisor: Thayasivam U
dc.contributor.author: Dissanayake MLS
dc.date.accept: 2022
dc.date.accessioned: 2022
dc.date.available: 2022
dc.date.issued: 2022
dc.description.abstract: In natural language processing, the availability of large-scale text data has made sentiment analysis a prevalent topic. Sentiment analysis is a text classification task that mainly focuses on classifying recommendations and reviews as positive or negative. Most earlier methods for this task collect product reviews, label them, and then train a classifier on the reviews and their labels. This training procedure needs a large amount of labeled data for each product, both because the distribution of reviews can differ between domains and in order to improve the performance of the classification models. However, labeling data is expensive and time-consuming. For low-resource languages such as Sinhala, annotated data is limited compared with languages such as English, so applying classification algorithms to sentiment classification in Sinhala is challenging. Rather than applying only traditional algorithms to analyze sentiment, this work uses pre-trained models (PTMs) and examines whether they outperform the traditional methods. PTMs play an important role in natural language processing because they can be applied to downstream tasks. This research therefore applies PTMs such as BERT and XLNet to sentiment classification. Experiments on BERT follow two approaches: fine-tuning the model and a feature-based approach. The existing RoBERTa-based Sinhala models SinBERT-small and SinBERT-large, which were trained on a large Sinhala corpus and are available on the official Hugging Face site, are also used. (en_US)
dc.identifier.accno: TH5120 (en_US)
dc.identifier.citation: Dissanayake, M.L.S. (2022). An Analytical study of pre - trained models for sentiment analysis of sinhala news comments [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/22462
dc.identifier.degree: MSc in Computer Science & Engineering (en_US)
dc.identifier.department: Department of Computer Science & Engineering (en_US)
dc.identifier.faculty: Engineering (en_US)
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/22462
dc.language.iso: en (en_US)
dc.subject: TRANSFER LEARNING (en_US)
dc.subject: SENTIMENT ANALYSIS (en_US)
dc.subject: PRE-TRAINED MODELS (en_US)
dc.subject: SINHALA NEWS COMMENTS (en_US)
dc.subject: COMPUTER SCIENCE- Dissertation (en_US)
dc.subject: COMPUTER SCIENCE & ENGINEERING - Dissertation (en_US)
dc.title: An Analytical study of pre - trained models for sentiment analysis of sinhala news comments (en_US)
dc.type: Thesis-Abstract (en_US)

Files

Original bundle

Name: TH5120-1.pdf
Size: 137.88 KB
Format: Adobe Portable Document Format
Description: Pre-Text

Name: TH5120-2.pdf
Size: 155.1 KB
Format: Adobe Portable Document Format
Description: Post-Text

Name: TH5120.pdf
Size: 1.42 MB
Format: Adobe Portable Document Format
Description: Full thesis