Institutional-Repository, University of Moratuwa.  

Monolingual sentence similarity measurement using Siamese neural networks for Sinhala and Tamil languages

dc.contributor.author Nilaxan, S
dc.contributor.author Ranathunga, S
dc.contributor.editor Adhikariwatte, W
dc.contributor.editor Rathnayake, M
dc.contributor.editor Hemachandra, K
dc.date.accessioned 2022-10-19T05:49:35Z
dc.date.available 2022-10-19T05:49:35Z
dc.date.issued 2021-07
dc.identifier.citation S. Nilaxan and S. Ranathunga, "Monolingual Sentence Similarity Measurement using Siamese Neural Networks for Sinhala and Tamil Languages," 2021 Moratuwa Engineering Research Conference (MERCon), 2021, pp. 567-572, doi: 10.1109/MERCon52712.2021.9525786. en_US
dc.identifier.uri http://dl.lib.uom.lk/handle/123/19133
dc.description.abstract Sentence similarity is useful in many Natural Language Processing tasks such as plagiarism checking and paraphrasing. So far, only conventional unsupervised sentence similarity measurement techniques (knowledge-based, corpus-based, string similarity-based, and hybrid) have been used to measure sentence similarity for Tamil and Sinhala languages. In this paper, we present a Deep Learning technique to measure sentence similarity for these two languages, which makes use of a Siamese Neural Network that consists of two Long Short-Term Memory (LSTM) networks, and neural word embeddings as the input representation. This approach achieved a 3.07% higher Pearson correlation coefficient for the dataset of 2500 Tamil sentence pairs, and a 3.61% higher Pearson correlation for the dataset of 5000 Sinhala sentence pairs over the conventional unsupervised sentence similarity measurement techniques. en_US
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.relation.uri https://ieeexplore.ieee.org/document/9525786/ en_US
dc.subject sentence similarity en_US
dc.subject Siamese neural networks en_US
dc.subject long short-term memory (LSTM) en_US
dc.subject Sinhala en_US
dc.subject Tamil en_US
dc.subject Word embeddings en_US
dc.subject FastText en_US
dc.title Monolingual sentence similarity measurement using Siamese neural networks for Sinhala and Tamil languages en_US
dc.type Conference-Full-text en_US
dc.identifier.faculty Engineering en_US
dc.identifier.department Engineering Research Unit, University of Moratuwa en_US
dc.identifier.year 2021 en_US
dc.identifier.conference Moratuwa Engineering Research Conference 2021 en_US
dc.identifier.place Moratuwa, Sri Lanka en_US
dc.identifier.pgnos pp. 567-572 en_US
dc.identifier.proceeding Proceedings of Moratuwa Engineering Research Conference 2021 en_US
dc.identifier.doi 10.1109/MERCon52712.2021.9525786 en_US

