Fine-tuning self-supervised multilingual sequence-to-sequence models for extremely low-resource NMT

dc.contributor.author: Thillainathan, S
dc.contributor.author: Ranathunga, S
dc.contributor.author: Jayasena, S
dc.contributor.editor: Adhikariwatte, W
dc.contributor.editor: Rathnayake, M
dc.contributor.editor: Hemachandra, K
dc.date.accessioned: 2022-10-20T04:15:20Z
dc.date.available: 2022-10-20T04:15:20Z
dc.date.issued: 2021-07
dc.description.abstract: Neural Machine Translation (NMT) tends to perform poorly in low-resource language settings due to the scarcity of parallel data. Instead of relying on inadequate parallel corpora, we can take advantage of monolingual data, which is available in abundance. One way to utilize monolingual data is to train a denoising self-supervised multilingual sequence-to-sequence model by noising large-scale monolingual corpora. If monolingual data for both languages of a translation pair is included in such a pre-trained multilingual denoising model, the model can be fine-tuned with a relatively small amount of parallel data for that pair. This paper presents fine-tuning of self-supervised multilingual sequence-to-sequence pre-trained models for extremely low-resource, domain-specific NMT settings. We choose one such pre-trained model: mBART. We are the first to implement and demonstrate the viability of non-English-centric complete fine-tuning of multilingual sequence-to-sequence pre-trained models. We select Sinhala, Tamil, and English to demonstrate fine-tuning in an extremely low-resource setting in the domain of official government documents. Experiments show that our fine-tuned mBART model significantly outperforms state-of-the-art Transformer-based NMT models in all six bilingual directions, with a 4.41 BLEU increase on Tamil→Sinhala and a 2.85 BLEU increase on Sinhala→Tamil translation.
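The denoising objective the abstract refers to (corrupting monolingual text and training the model to reconstruct it) can be illustrated with a toy sketch of mBART-style text infilling. The function name `infill_noise`, the uniform span lengths, and the plain `<mask>` string are illustrative assumptions, not the paper's implementation: mBART samples span lengths from a Poisson distribution (λ = 3.5) and replaces each span with a learned sentinel token.

```python
import random

MASK = "<mask>"  # illustrative sentinel; mBART uses a learned mask token


def infill_noise(tokens, mask_ratio=0.35, seed=0):
    """Toy text-infilling noise: replace contiguous spans of tokens with a
    single <mask> sentinel until roughly mask_ratio of the input tokens
    have been consumed. Span lengths are drawn uniformly from 1..4 here
    for simplicity (mBART samples them from Poisson(3.5))."""
    rng = random.Random(seed)
    out = list(tokens)
    budget = int(len(tokens) * mask_ratio)  # tokens left to corrupt
    while budget > 0 and len(out) > 1:
        # Span length is capped by the remaining budget and sequence size.
        span = min(rng.randint(1, 4), budget, len(out) - 1)
        start = rng.randrange(len(out) - span + 1)
        out[start:start + span] = [MASK]  # whole span becomes one sentinel
        budget -= span
    return out


# Example: corrupt a sentence; the denoising model would be trained to
# reconstruct the original token sequence from this noised input.
noised = infill_noise("the quick brown fox jumps over the lazy dog".split())
```

A sequence-to-sequence model pre-trained to invert this corruption learns representations of each language from monolingual text alone, which is why only a small parallel corpus is then needed for fine-tuning.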
dc.identifier.citation: S. Thillainathan, S. Ranathunga and S. Jayasena, "Fine-Tuning Self-Supervised Multilingual Sequence-To-Sequence Models for Extremely Low-Resource NMT," 2021 Moratuwa Engineering Research Conference (MERCon), 2021, pp. 432-437, doi: 10.1109/MERCon52712.2021.9525720.
dc.identifier.conference: Moratuwa Engineering Research Conference 2021
dc.identifier.department: Engineering Research Unit, University of Moratuwa
dc.identifier.doi: 10.1109/MERCon52712.2021.9525720
dc.identifier.faculty: Engineering
dc.identifier.pgnos: *****
dc.identifier.place: Moratuwa, Sri Lanka
dc.identifier.proceeding: Proceedings of Moratuwa Engineering Research Conference 2021
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/19156
dc.identifier.year: 2021
dc.language.iso: en
dc.publisher: IEEE
dc.relation.uri: https://ieeexplore.ieee.org/document/9525720
dc.subject: neural machine translation
dc.subject: pre-trained models
dc.subject: fine-tuning
dc.subject: denoising autoencoder
dc.subject: low-resource languages
dc.title: Fine-tuning self-supervised multilingual sequence-to-sequence models for extremely low-resource NMT
dc.type: Conference-Full-text
