Fine tuning named entity extraction models for the fantasy domain
dc.contributor.author | Sivaganeshan, A | |
dc.contributor.author | Silva, ND | |
dc.contributor.editor | Abeysooriya, R | |
dc.contributor.editor | Adikariwattage, V | |
dc.contributor.editor | Hemachandra, K | |
dc.date.accessioned | 2024-03-14T05:07:56Z | |
dc.date.available | 2024-03-14T05:07:56Z | |
dc.date.issued | 2023-12-09 | |
dc.description.abstract | Named Entity Recognition (NER) is a sequence classification Natural Language Processing task where entities are identified in the text and classified into predefined categories. It acts as a foundation for most information extraction systems. Dungeons and Dragons (D&D) is an open-ended tabletop fantasy game with its own diverse lore. D&D entities are domain-specific and are thus unrecognizable by even the state-of-the-art off-the-shelf NER systems, as these systems are trained on general data for predefined categories such as person (PERS), location (LOC), organization (ORG), and miscellaneous (MISC). For meaningful extraction of information from fantasy text, the entities need to be classified into domain-specific entity categories, and the models need to be fine-tuned on a domain-relevant corpus. This work uses available lore of monsters in the D&D domain to fine-tune Trankit, which is a prolific NER framework that uses a pre-trained model for NER. Upon this training, the system acquires the ability to extract monster names from relevant domain documents under a novel NER tag. This work compares the accuracy of the monster name identification against the zero-shot Trankit model and two FLAIR models. The fine-tuned Trankit model achieves an 87.86% F1 score, surpassing all the other considered models. | en_US |
dc.identifier.citation | A. Sivaganeshan and N. De Silva, "Fine Tuning Named Entity Extraction Models for the Fantasy Domain," 2023 Moratuwa Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 2023, pp. 346-351, doi: 10.1109/MERCon60487.2023.10355501. | en_US |
dc.identifier.conference | Moratuwa Engineering Research Conference 2023 | en_US |
dc.identifier.department | Engineering Research Unit, University of Moratuwa | en_US |
dc.identifier.email | sivaganeshan.22@cse.mrt.ac.lk | en_US |
dc.identifier.email | nisansadds@cse.mrt.ac.lk | en_US |
dc.identifier.faculty | Engineering | en_US |
dc.identifier.pgnos | pp. 346-351 | en_US |
dc.identifier.place | Katubedda | en_US |
dc.identifier.proceeding | Proceedings of Moratuwa Engineering Research Conference 2023 | en_US |
dc.identifier.uri | http://dl.lib.uom.lk/handle/123/22309 | |
dc.identifier.year | 2023 | en_US |
dc.language.iso | en | en_US |
dc.publisher | IEEE | en_US |
dc.relation.uri | https://ieeexplore.ieee.org/document/10355501/ | en_US |
dc.subject | Trankit | en_US |
dc.subject | Dungeons and Dragons | en_US |
dc.subject | FLAIR | en_US |
dc.title | Fine tuning named entity extraction models for the fantasy domain | en_US |
dc.type | Conference-Full-text | en_US |
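
The abstract reports its result as an F1 score for monster name identification. As a rough illustration of how such a score is commonly computed for NER, the sketch below derives entity-level precision, recall, and F1 from BIO-tagged sequences. It is not the authors' evaluation code: the tag name `MONSTER`, the helper names `bio_to_spans` and `entity_f1`, and the example sentence are all assumptions made for illustration, and the record does not state which scoring convention the paper used.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, label); a hypothetical span representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence such as B-MONSTER / I-MONSTER / O."""
    spans: Set[Span] = set()
    start = None
    label = None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: "The Beholder and the Mind Flayer attacked"
# "MONSTER" is an assumed tag name; the abstract only says "a novel NER tag".
gold = ["O", "B-MONSTER", "O", "O", "B-MONSTER", "I-MONSTER", "O"]
pred = ["O", "B-MONSTER", "O", "O", "B-MONSTER", "O", "O"]
precision, recall, f1 = entity_f1(gold, pred)
print(precision, recall, f1)
```

Under exact-match scoring, a prediction that captures only part of a multi-word name (here "Mind" without "Flayer") counts as both a false positive and a false negative, which is why multi-token monster names can weigh heavily on the score.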