Institutional-Repository, University of Moratuwa.  

Monolingual sentence similarity measurement using siamese neural networks for Sinhala and Tamil languages

dc.contributor.advisor Ranathunga S
dc.contributor.author Satkunanantham N
dc.date.accessioned 2021
dc.date.available 2021
dc.date.issued 2021
dc.identifier.citation Satkunanantham, N. (2021). Monolingual sentence similarity measurement using siamese neural networks for Sinhala and Tamil languages [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/20465
dc.identifier.uri http://dl.lib.uom.lk/handle/123/20465
dc.description.abstract Sentence similarity plays a key role in text-processing research such as plagiarism checking and paraphrasing. So far, only conventional unsupervised techniques (string-based, corpus-based, knowledge-based, and hybrid approaches) have been used to measure sentence similarity for the Tamil and Sinhala languages. In this research, we introduce a deep learning methodology for measuring sentence similarity in these two languages, which uses a Siamese recurrent neural network with a word-embedding model as the input representation. Compared with the conventional unsupervised techniques applied to the same datasets, this approach achieved a 3.07% higher Pearson correlation coefficient on the Tamil dataset of 2500 sentence pairs and a 3.61% higher Pearson correlation coefficient on the Sinhala dataset of 5000 sentence pairs. en_US
dc.language.iso en en_US
dc.subject SENTENCE-SIMILARITY en_US
dc.subject SINHALA, TAMIL en_US
dc.subject SIAMESE NEURAL NETWORK en_US
dc.subject LSTM en_US
dc.subject DEEP-LEARNING en_US
dc.subject FASTTEXT en_US
dc.subject NATURAL LANGUAGE PROCESSING en_US
dc.subject COMPUTER SCIENCE - Dissertation en_US
dc.subject COMPUTER SCIENCE & ENGINEERING - Dissertation en_US
dc.subject INFORMATION TECHNOLOGY - Dissertation en_US
dc.title Monolingual sentence similarity measurement using siamese neural networks for Sinhala and Tamil languages en_US
dc.type Thesis-Abstract en_US
dc.identifier.faculty Engineering en_US
dc.identifier.degree MSc in Computer Science and Engineering en_US
dc.identifier.department Department of Computer Science & Engineering en_US
dc.date.accept 2021
dc.identifier.accno TH4661 en_US
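
The abstract reports its results as Pearson correlation coefficients between predicted and reference similarity scores. As a minimal sketch of how such a coefficient is computed, assuming scores are plain lists of floats, the following may help; the function name `pearson` and the sample scores are hypothetical illustrations and are not taken from the thesis or its datasets.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five sentence pairs (illustration only, not thesis data).
gold = [0.0, 1.5, 3.0, 4.5, 5.0]        # human-annotated similarity
predicted = [0.4, 1.2, 2.9, 4.1, 4.8]   # scores from some similarity model

r = pearson(gold, predicted)
print(f"Pearson r = {r:.4f}")
```

A coefficient close to 1 means the model's scores rise and fall with the human annotations; comparing two systems on the same sentence pairs then amounts to comparing their coefficients.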

