dc.contributor.author |
Farhath, F |
|
dc.contributor.author |
Ranathunga, S |
|
dc.contributor.author |
Jayasena, S |
|
dc.contributor.author |
Dias, G |
|
dc.contributor.editor |
Chathuranga, D |
|
dc.date.accessioned |
2022-08-16T04:34:59Z |
|
dc.date.available |
2022-08-16T04:34:59Z |
|
dc.date.issued |
2018-05 |
|
dc.identifier.citation |
F. Farhath, S. Ranathunga, S. Jayasena and G. Dias, "Integration of Bilingual Lists for Domain-Specific Statistical Machine Translation for Sinhala-Tamil," 2018 Moratuwa Engineering Research Conference (MERCon), 2018, pp. 538-543, doi: 10.1109/MERCon.2018.8421901. |
en_US |
dc.identifier.uri |
http://dl.lib.uom.lk/handle/123/18646 |
|
dc.description.abstract |
Availability of quality parallel data is a major
requirement to build a reasonably well performing statistical
machine translation (SMT) system. Thus, developing a decent
SMT system for a low-resourced language pair like Sinhala and
Tamil that does not have a large parallel corpus is rather
challenging. Past research on other language pairs has
shown that different terminology / bilingual list integration
methodologies can be used to improve the quality of SMT
systems, for domain-specific SMT in particular. In this paper, we
explore whether this can be effective for Sinhala-Tamil machine
translation for the domain of official government documents. We
evaluate the impact of three types of bilingual lists, namely, a list
of government organizations and official designations, a glossary
related to government administration and operations, and a
general bilingual dictionary, based on four different
methodologies (three static and one dynamic). Of the four, one
methodology gave notable improvements over the baseline for all
three types of lists. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
IEEE |
en_US |
dc.relation.uri |
https://ieeexplore.ieee.org/document/8421901 |
en_US |
dc.subject |
statistical machine translation |
en_US |
dc.subject |
Sinhala, Tamil |
en_US |
dc.subject |
low-resourced |
en_US |
dc.subject |
terminology integration |
en_US |
dc.title |
Integration of bilingual lists for domain-specific statistical machine translation for Sinhala-Tamil |
en_US |
dc.type |
Conference-Full-text |
en_US |
dc.identifier.faculty |
Engineering |
en_US |
dc.identifier.department |
Engineering Research Unit, University of Moratuwa |
en_US |
dc.identifier.year |
2018 |
en_US |
dc.identifier.conference |
2018 Moratuwa Engineering Research Conference (MERCon) |
en_US |
dc.identifier.pgnos |
pp. 538-543 |
en_US |
dc.identifier.proceeding |
Proceedings of 2018 Moratuwa Engineering Research Conference (MERCon) |
en_US |
dc.identifier.email |
fathimafarhath@cse.mrt.ac.lk |
en_US |
dc.identifier.email |
surangika@cse.mrt.ac.lk |
en_US |
dc.identifier.email |
sanath@cse.mrt.ac.lk |
en_US |
dc.identifier.email |
gihan@cse.mrt.ac.lk |
en_US |
dc.identifier.doi |
10.1109/MERCon.2018.8421901 |
en_US |