Monolingual sentence similarity measurement using siamese neural networks for Sinhala and Tamil languages
dc.contributor.advisor | Ranathunga S | |
dc.contributor.author | Satkunanantham N | |
dc.date.accept | 2021 | |
dc.date.accessioned | 2021 | |
dc.date.available | 2021 | |
dc.date.issued | 2021 | |
dc.description.abstract | Sentence similarity plays a key role in text-processing research such as plagiarism checking and paraphrasing. So far, only conventional unsupervised techniques (string-based, corpus-based, knowledge-based, and hybrid approaches) have been used to measure sentence similarity for the Tamil and Sinhala languages. In this research, we introduce a deep learning methodology for measuring sentence similarity in these two languages, which uses a Siamese recurrent neural network with a word-embedding model as the input representation. This approach achieved a 3.07% higher Pearson correlation coefficient on the Tamil dataset of 2500 sentence pairs and a 3.61% higher Pearson correlation coefficient on the Sinhala dataset of 5000 sentence pairs. Both results outperform those of the conventional unsupervised sentence similarity techniques applied to the same datasets. | en_US |
dc.identifier.accno | TH4661 | en_US |
dc.identifier.citation | Satkunanantham, N. (2021). Monolingual sentence similarity measurement using siamese neural networks for Sinhala and Tamil languages [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/20465 | |
dc.identifier.degree | MSc in Computer Science and Engineering | en_US |
dc.identifier.department | Department of Computer Science & Engineering | en_US |
dc.identifier.faculty | Engineering | en_US |
dc.identifier.uri | http://dl.lib.uom.lk/handle/123/20465 | |
dc.language.iso | en | en_US |
dc.subject | SENTENCE-SIMILARITY | en_US |
dc.subject | SINHALA, TAMIL | en_US |
dc.subject | SIAMESE NEURAL NETWORK | en_US |
dc.subject | LSTM | en_US |
dc.subject | DEEP-LEARNING | en_US |
dc.subject | FASTTEXT | en_US |
dc.subject | NATURAL LANGUAGE PROCESSING | en_US |
dc.subject | COMPUTER SCIENCE - Dissertation | en_US |
dc.subject | COMPUTER SCIENCE & ENGINEERING - Dissertation | en_US |
dc.subject | INFORMATION TECHNOLOGY - Dissertation | en_US |
dc.title | Monolingual sentence similarity measurement using siamese neural networks for Sinhala and Tamil languages | en_US |
dc.type | Thesis-Abstract | en_US |
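The abstract reports its results as Pearson correlation coefficients between predicted and reference similarity scores. As a minimal sketch of how such a coefficient is computed, the following uses only the Python standard library. The function name `pearson_r` and the two score lists are hypothetical and chosen for illustration; they are not data or code from the thesis.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five sentence pairs (NOT data from the thesis).
gold_scores = [0.0, 1.5, 2.5, 4.0, 5.0]       # assumed human-annotated similarity
predicted_scores = [0.4, 1.2, 2.9, 3.6, 4.8]  # assumed model output

r = pearson_r(gold_scores, predicted_scores)
print(f"Pearson r = {r:.4f}")
```

A value close to 1 means the predicted scores rise and fall with the reference scores. The coefficient measures linear agreement only, so it does not depend on the two score lists sharing the same scale.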
Files
Original bundle
Name | Size | Format | Description
TH4661-1.pdf | 213.2 KB | Adobe Portable Document Format | Pre-text
TH4661-2.pdf | 143.77 KB | Adobe Portable Document Format | Post-text
TH4661.pdf | 2.39 MB | Adobe Portable Document Format | Full-thesis