Fine-Grained Named Entity Recognition for Sinhala

Date

2020

Publisher

IEEE

Abstract

For English, Named Entity Recognition (NER) is more or less a solved problem. For low-resource, morphologically rich languages such as Sinhala, however, minimal research has been done. In this paper, we present a novel fine-grained Named Entity (NE) tag set and an NE-annotated Sinhala corpus of 70k word tokens. We trained a custom NER model for Sinhala based on Conditional Random Fields (CRF). Despite the low-resource setting, this NER model achieved a micro-averaged F1 score of 84.8.
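For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-averaged F1 score can be computed for a tagging task. It is not taken from the paper: it assumes token-level scoring with the non-entity tag `O` excluded, and the function name `micro_f1` and the tags `PER`, `LOC`, and `ORG` are hypothetical placeholders, not the paper's tag set. The paper's own evaluation protocol may differ, for example by scoring whole entity spans.

```python
def micro_f1(gold, pred, ignore=("O",)):
    """Token-level micro-averaged F1, pooling counts over all entity tags.

    Assumption: tags listed in `ignore` (here the non-entity tag "O")
    are not counted as entity predictions.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p not in ignore:
            if p == g:
                tp += 1  # predicted an entity tag and it matches gold
            else:
                fp += 1  # predicted an entity tag that gold does not have
        if g not in ignore and p != g:
            fn += 1      # gold entity tag that the prediction missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical five-token example
gold = ["PER", "O", "LOC", "O", "ORG"]
pred = ["PER", "O", "ORG", "LOC", "ORG"]
score = micro_f1(gold, pred)
```

Because the counts are pooled across all tags before precision and recall are computed, frequent entity types weigh more heavily in a micro average than rare ones, which matters for a fine-grained tag set with uneven class sizes.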

Citation

R. Azeez and S. Ranathunga, "Fine-Grained Named Entity Recognition for Sinhala," 2020 Moratuwa Engineering Research Conference (MERCon), 2020, pp. 295-300, doi: 10.1109/MERCon50084.2020.9185296.
