Institutional-Repository, University of Moratuwa.  

Short Tamil sentence similarity calculation using knowledge-based and corpus-based similarity measures

dc.contributor.author Selvarasa, A
dc.contributor.author Thirunavukkarasu, N
dc.contributor.author Rajendran, N
dc.contributor.author Yogalingam, C
dc.contributor.author Ranathunga, S
dc.contributor.author Dias, G
dc.date.accessioned 2018-08-20T21:00:28Z
dc.date.available 2018-08-20T21:00:28Z
dc.identifier.uri http://dl.lib.mrt.ac.lk/handle/123/13405
dc.description.abstract Sentence similarity calculation plays an important role in text processing-related research. Many unsupervised techniques, such as knowledge-based, corpus-based, string similarity-based, and graph alignment techniques, are available to measure sentence similarity. However, none of these techniques has been applied to Tamil. In this paper, we present the first-ever system to measure semantic similarity for short Tamil sentences using a hybrid approach that combines knowledge-based and corpus-based techniques. We tested this system with 2000 general sentence pairs and 100 mathematical sentence pairs. For the dataset of 2000 sentence pairs, this approach achieved a Mean Squared Error of 0.195 and a Pearson correlation coefficient of 0.815. For the 100 mathematical sentence pairs, this approach achieved an accuracy of 85%. en_US
dc.language.iso en en_US
dc.subject Sentence similarity, Tamil, Knowledge-based, Corpus-based en_US
dc.title Short Tamil sentence similarity calculation using knowledge-based and corpus-based similarity measures en_US
dc.type Conference-Abstract en_US
dc.identifier.faculty Engineering en_US
dc.identifier.department Department of Computer Science and Engineering en_US
dc.identifier.year 2017 en_US
dc.identifier.conference Moratuwa Engineering Research Conference - MERCon 2017 en_US
dc.identifier.place Moratuwa, Sri Lanka en_US
dc.identifier.email anutharsha.12@cse.mrt.ac.lk en_US
dc.identifier.email nilathiru.12@cse.mrt.ac.lk en_US
dc.identifier.email niveathika.12@cse.mrt.ac.lk en_US
dc.identifier.email chinthoorie.12@cse.mrt.ac.lk en_US
dc.identifier.email surangika@cse.mrt.ac.lk en_US
dc.identifier.email gihan@uom.lk en_US


This item appears in the following Collection(s)

  • 2014-9th [41]
    Conference Proceedings 9th Asia Pacific Conference on Transportation and the Environment
