dc.contributor.author Sarveswaran, K
dc.contributor.author Dias, G
dc.contributor.author Butt, M
dc.contributor.editor Wijesiriwardana, CP
dc.date.accessioned 2022-12-05T05:40:51Z
dc.date.available 2022-12-05T05:40:51Z
dc.date.issued 2018
dc.identifier.citation K. Sarveswaran, G. Dias and M. Butt, "ThamizhiFST: A Morphological Analyser and Generator for Tamil Verbs," 2018 3rd International Conference on Information Technology Research (ICITR), 2018, pp. 1-6, doi: 10.1109/ICITR.2018.8736139. en_US
dc.identifier.uri http://dl.lib.uom.lk/handle/123/19645
dc.description.abstract ThamizhiFST is a Morphological Analyser and Generator (MAG) for Tamil. It was developed to extend the coverage of the computational Tamil grammar being developed using Lexical Functional Grammar (LFG). ThamizhiFST covers the simple verbs in Tamil as an initial step. A Finite State Transducer (FST) approach was used to develop the MAG and it was implemented using the FOMA Open Source Software. Since morphological rules are of a finite nature and represent a known quantity, a rule-based approach like FST is more appropriate than possible machine learning alternatives, especially with respect to achieving reliably good accuracy that is required for computational grammar development. A set of 3250 Tamil verb lemmas from 13 paradigms together with their 260 conjugation forms were used in the construction of ThamizhiFST. Further, a set of 27 labels were used to mark the morphosyntactic information of the verbs. The whole system was developed as a three-layer web-based system to tackle the issues arising when processing an agglutinative language like Tamil and to ensure its extendability. Unlike other existing MAGs, ThamizhiFST also provides the morpheme corresponding to each morphosyntactic label and marks morpheme boundaries. An evaluation shows that ThamizhiFST has an f-measure of 0.97 for simple verbs. Future and current work include work on extending the system to cover more verbs and nouns and make it generally available. en_US
dc.language.iso en en_US
dc.publisher Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa, Sri Lanka en_US
dc.relation.uri https://ieeexplore.ieee.org/document/8736139 en_US
dc.subject Morphological analyser en_US
dc.subject Morphological generator en_US
dc.subject Finite state transducer en_US
dc.subject Tamil en_US
dc.title ThamizhiFST: A Morphological Analyser and Generator for Tamil Verbs en_US
dc.type Conference-Full-text en_US
dc.identifier.faculty IT en_US
dc.identifier.department Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa. en_US
dc.identifier.year 2018 en_US
dc.identifier.conference 3rd International Conference on Information Technology Research 2018 en_US
dc.identifier.proceeding Proceedings of the 3rd International Conference in Information Technology Research 2018 en_US
dc.identifier.email sarvesk@uom.lk en_US
dc.identifier.email gihan@uom.lk en_US
dc.identifier.email miriam.butt@uni-konstanz.de en_US
dc.identifier.doi doi: 10.1109/ICITR.2018.8736139 en_US


This item appears in the following Collection(s)

  • ICITR - 2018 [34]
    International Conference on Information Technology Research (ICITR)
