Fine-grained named entity recognition for Sinhala

dc.contributor.author: Azeez, R
dc.contributor.author: Ranathunga, S
dc.contributor.editor: Weeraddana, C
dc.contributor.editor: Edussooriya, CUS
dc.date.accessioned: 2022-08-09T06:38:14Z
dc.date.available: 2022-08-09T06:38:14Z
dc.date.issued: 2020
dc.description.abstract: For English, Named Entity Recognition (NER) is more or less a solved problem. However, for low-resourced and morphologically rich languages such as Sinhala, minimal research has been done. In this paper, we present a novel fine-grained Named Entity (NE) tag set and an NE annotated Sinhala corpus of 70k word tokens. We trained a custom NER model for Sinhala based on Conditional Random Fields (CRF). Despite the low-resourced setting, this NER model could achieve a micro-averaged F1 score of 84.8. (en_US)
dc.identifier.citation: R. Azeez and S. Ranathunga, "Fine-Grained Named Entity Recognition for Sinhala," 2020 Moratuwa Engineering Research Conference (MERCon), 2020, pp. 295-300, doi: 10.1109/MERCon50084.2020.9185296. (en_US)
dc.identifier.conference: Moratuwa Engineering Research Conference 2020 (en_US)
dc.identifier.department: Engineering Research Unit, University of Moratuwa (en_US)
dc.identifier.doi: 10.1109/MERCon50084.2020.9185296 (en_US)
dc.identifier.email: rameelaa@uom.lk (en_US)
dc.identifier.email: surangika@cse.mrt.ac.lk (en_US)
dc.identifier.faculty: Engineering (en_US)
dc.identifier.pgnos: pp. 295-300 (en_US)
dc.identifier.place: Moratuwa, Sri Lanka (en_US)
dc.identifier.proceeding: Proceedings of Moratuwa Engineering Research Conference 2020 (en_US)
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/18576
dc.identifier.year: 2020 (en_US)
dc.language.iso: en (en_US)
dc.publisher: IEEE (en_US)
dc.relation.uri: https://ieeexplore.ieee.org/document/9185296 (en_US)
dc.subject: named entity recognition (en_US)
dc.subject: sinhala (en_US)
dc.subject: named entity (en_US)
dc.subject: conditional random fields (en_US)
dc.title: Fine-grained named entity recognition for Sinhala (en_US)
dc.type: Conference-Full-text (en_US)
