dc.contributor.author |
Sudesh, P |
|
dc.contributor.author |
Dashintha, D |
|
dc.contributor.author |
Lakshan, R |
|
dc.contributor.author |
Dias, G |
|
dc.contributor.editor |
Rathnayake, M |
|
dc.contributor.editor |
Adhikariwatte, V |
|
dc.contributor.editor |
Hemachandra, K |
|
dc.date.accessioned |
2022-10-27T09:44:47Z |
|
dc.date.available |
2022-10-27T09:44:47Z |
|
dc.date.issued |
2022-07 |
|
dc.identifier.citation |
P. Sudesh, D. Dashintha, R. Lakshan and G. Dias, "Erroff: A Tool to Identify and Correct Real-word Errors in Sinhala Documents," 2022 Moratuwa Engineering Research Conference (MERCon), 2022, pp. 1-6, doi: 10.1109/MERCon55799.2022.9906294. |
en_US |
dc.identifier.uri |
http://dl.lib.uom.lk/handle/123/19274 |
|
dc.description.abstract |
Sinhala is a low-resource Indo-Aryan language used by approximately 16 million people, mainly in Sri Lanka. Because of the complexity of the Sinhala language, detecting spelling errors is difficult. A real-word error occurs when a word is in the vocabulary but is not valid in the context in which it appears. Checking for real-word errors in a sentence is more difficult than checking for non-word errors, which involve words that are not in the vocabulary. We present the implementation of a neural-network-based system for identifying real-word errors and non-word errors in Sinhala. We prepared a candidate list of real-word errors, then selected a suitable model and trained it using several different datasets. This paper thus sets a new baseline for the detection and correction of real-word errors in Sinhala documents. Our product, source code, candidate error list, training datasets, and evaluation dataset are publicly released. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
IEEE |
en_US |
dc.relation.uri |
https://ieeexplore.ieee.org/document/9906294 |
en_US |
dc.subject |
Sinhala |
en_US |
dc.subject |
NLP |
en_US |
dc.subject |
Real-word errors |
en_US |
dc.subject |
Spell checker |
en_US |
dc.title |
Erroff: a tool to identify and correct real-word errors in Sinhala documents |
en_US |
dc.type |
Conference-Full-text |
en_US |
dc.identifier.faculty |
Engineering |
en_US |
dc.identifier.department |
Engineering Research Unit, University of Moratuwa |
en_US |
dc.identifier.year |
2022 |
en_US |
dc.identifier.conference |
Moratuwa Engineering Research Conference 2022 |
en_US |
dc.identifier.place |
Moratuwa, Sri Lanka |
en_US |
dc.identifier.proceeding |
Proceedings of Moratuwa Engineering Research Conference 2022 |
en_US |
dc.identifier.email |
sudeshdilshan.17@cse.mrt.ac.lk |
|
dc.identifier.email |
dashinthadilan.17@cse.mrt.ac.lk |
|
dc.identifier.email |
rashnanayakkara.17@cse.mrt.ac.lk |
|
dc.identifier.email |
gihan@uom.lk |
|
dc.identifier.doi |
10.1109/MERCon55799.2022.9906294 |
en_US |