Multi-domain neural machine translation with knowledge distillation for low resource languages

dc.contributor.advisorDe Silva, N
dc.contributor.advisorRanathunga, S
dc.contributor.authorMenan, V
dc.date.accept2025
dc.date.accessioned2025-11-21T06:14:23Z
dc.date.issued2025
dc.description.abstractMulti-domain adaptation in Neural Machine Translation (NMT) is crucial for ensuring high-quality translations across diverse domains. Traditional fine-tuning approaches, while effective, become impractical as the number of domains increases, leading to high computational costs, space complexity, and catastrophic forgetting. Knowledge Distillation (KD) offers a scalable alternative by training a compact model using distilled data from a larger teacher model. However, we hypothesize that sequence-level KD primarily distills the decoder while neglecting encoder knowledge transfer, resulting in suboptimal adaptation and generalization, particularly in low-resource language settings where both data and computational resources are constrained. To address this, we propose an improved sequence-level distillation framework enhanced with encoder alignment using a cosine similarity-based loss. Our approach ensures that the student model captures both encoder and decoder knowledge, mitigating the limitations of conventional KD. We evaluate our method on multi-domain German–English translation under simulated low-resource conditions and further extend the evaluation to a bona fide low-resource language, demonstrating the method's robustness across diverse data conditions. Results demonstrate that our proposed encoder-aligned student model can even outperform its larger teacher models, achieving strong generalization across domains. Additionally, our method enables efficient domain adaptation when fine-tuned on new domains, surpassing existing KD-based approaches. These findings establish encoder alignment as a crucial component for effective knowledge transfer in multi-domain NMT, with significant implications for scalable and resource-efficient domain adaptation.
dc.identifier.accnoTH5868
dc.identifier.citationMenan, V. (2025). Multi-domain neural machine translation with knowledge distillation for low resource languages [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/24433
dc.identifier.degreeMSc (Major Component Research)
dc.identifier.departmentDepartment of Computer Science and Engineering
dc.identifier.facultyEngineering
dc.identifier.urihttps://dl.lib.uom.lk/handle/123/24433
dc.language.isoen
dc.subjectNEURAL MACHINE TRANSLATION-Multi-Domain Adaptation
dc.subjectKNOWLEDGE DISTILLATION-Sequence-Level Distillation
dc.subjectHUMAN LANGUAGES-Low-Resource Languages
dc.subjectNATURAL LANGUAGE PROCESSING-Encoder Alignment
dc.subjectMSC (MAJOR COMPONENT RESEARCH)-Dissertation
dc.subjectCOMPUTER SCIENCE AND ENGINEERING-Dissertation
dc.subjectMSc (Major Component Research)
dc.titleMulti-domain neural machine translation with knowledge distillation for low resource languages
dc.typeThesis-Full-text
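
Note on the method described in the abstract: the encoder-alignment idea pairs standard sequence-level KD (cross-entropy against teacher-generated translations) with a cosine similarity-based loss between teacher and student encoder representations. The PyTorch sketch below is illustrative only and is not the thesis's actual implementation; the mean-pooling strategy, the optional projection for mismatched hidden sizes, and the weighting hyperparameter lam are assumptions.

import torch
import torch.nn.functional as F

def encoder_alignment_loss(student_enc, teacher_enc, src_mask, proj=None):
    """Cosine similarity-based alignment between encoder outputs.

    student_enc: (batch, src_len, d_student) student encoder states
    teacher_enc: (batch, src_len, d_teacher) teacher encoder states
    src_mask:    (batch, src_len), 1 for real tokens, 0 for padding
    proj:        optional nn.Linear mapping d_student -> d_teacher
                 (an assumption; only needed if hidden sizes differ)
    """
    if proj is not None:
        student_enc = proj(student_enc)
    m = src_mask.unsqueeze(-1).float()
    # Mean-pool over non-padding source positions.
    s = (student_enc * m).sum(dim=1) / m.sum(dim=1).clamp(min=1e-9)
    t = (teacher_enc.detach() * m).sum(dim=1) / m.sum(dim=1).clamp(min=1e-9)
    # Loss is 1 - cos(s, t), averaged over the batch; 0 when fully aligned.
    return (1.0 - F.cosine_similarity(s, t, dim=-1)).mean()

def distillation_loss(student_logits, teacher_sequences, student_enc,
                      teacher_enc, src_mask, pad_id, lam=1.0):
    # Sequence-level KD: cross-entropy against teacher-generated targets.
    # student_logits: (batch, tgt_len, vocab); teacher_sequences: (batch, tgt_len)
    ce = F.cross_entropy(student_logits.transpose(1, 2), teacher_sequences,
                         ignore_index=pad_id)
    # Encoder-alignment term, weighted by the (assumed) hyperparameter lam.
    align = encoder_alignment_loss(student_enc, teacher_enc, src_mask)
    return ce + lam * align

Under these assumptions, the alignment term is simply added to the usual sequence-level KD objective, so the student is pushed to match the teacher's encoder representations as well as its decoder outputs.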

Files

Original bundle

Name: TH5868-1.pdf
Size: 549.95 KB
Format: Adobe Portable Document Format
Description: Pre-text
Name: TH5868-2.pdf
Size: 103.35 KB
Format: Adobe Portable Document Format
Description: Post-text
Name: TH5868.pdf
Size: 962.96 KB
Format: Adobe Portable Document Format
Description: Full-thesis

License bundle

Name: license.txt
Size: 1.71 KB
Format:
Description: Item-specific license agreed upon to submission