Multi-domain neural machine translation with knowledge distillation for low resource languages
| dc.contributor.advisor | De Silva, N | |
| dc.contributor.advisor | Ranathunga, S | |
| dc.contributor.author | Menan, V | |
| dc.date.accept | 2025 | |
| dc.date.accessioned | 2025-11-21T06:14:23Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Multi-domain adaptation in Neural Machine Translation (NMT) is crucial for ensuring high-quality translations across diverse domains. Traditional fine-tuning approaches, while effective, become impractical as the number of domains increases, leading to high computational costs, space complexity, and catastrophic forgetting. Knowledge Distillation (KD) offers a scalable alternative by training a compact model using distilled data from a larger teacher model. However, we hypothesize that sequence-level KD primarily distills the decoder while neglecting encoder knowledge transfer, resulting in suboptimal adaptation and generalization, particularly in low-resource language settings where both data and computational resources are constrained. To address this, we propose an improved sequence-level distillation framework enhanced with encoder alignment using a cosine similarity-based loss. Our approach ensures that the student model captures both encoder and decoder knowledge, mitigating the limitations of conventional KD. We evaluate our method on multi-domain German–English translation under simulated low-resource conditions and further extend the evaluation to a bona fide low-resource language, demonstrating the method’s robustness across diverse data conditions. Results demonstrate that our proposed encoder-aligned student model can even outperform its larger teacher models, achieving strong generalization across domains. Additionally, our method enables efficient domain adaptation when fine-tuned on new domains, surpassing existing KD-based approaches. These findings establish encoder alignment as a crucial component for effective knowledge transfer in multi-domain NMT, with significant implications for scalable and resource-efficient domain adaptation. | |
| dc.identifier.accno | TH5868 | |
| dc.identifier.citation | Menan, V. (2025). Multi-domain neural machine translation with knowledge distillation for low resource languages [Master’s thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/24433 | |
| dc.identifier.degree | MSc (Major Component Research) | |
| dc.identifier.department | Department of Computer Science and Engineering | |
| dc.identifier.faculty | Engineering | |
| dc.identifier.uri | https://dl.lib.uom.lk/handle/123/24433 | |
| dc.language.iso | en | |
| dc.subject | NEURAL MACHINE TRANSLATION-Multi-Domain Adaptation | |
| dc.subject | KNOWLEDGE DISTILLATION-Sequence-Level Distillation | |
| dc.subject | HUMAN LANGUAGES-Low-Resource Languages | |
| dc.subject | NATURAL LANGUAGE PROCESSING-Encoder Alignment | |
| dc.subject | MSC (MAJOR COMPONENT RESEARCH)-Dissertation | |
| dc.subject | COMPUTER SCIENCE AND ENGINEERING-Dissertation | |
| dc.subject | MSc (Major Component Research) | |
| dc.title | Multi-domain neural machine translation with knowledge distillation for low resource languages | |
| dc.type | Thesis-Full-text |
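
The abstract's core idea — adding a cosine similarity-based encoder alignment loss on top of sequence-level KD — can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the tensor names, the model interface, and the `alpha` weighting between the two terms are all assumptions, and sequence-level KD is represented in its standard form, i.e., cross-entropy training on teacher-generated target sentences.

```python
"""Sketch: sequence-level KD plus a cosine encoder-alignment loss.

Assumptions (not from the thesis): PyTorch, encoder-decoder models that
expose per-token encoder hidden states, matching hidden sizes between
teacher and student, and a simple weighted sum of the two losses.
"""
import torch
import torch.nn.functional as F


def encoder_alignment_loss(student_enc: torch.Tensor,
                           teacher_enc: torch.Tensor,
                           src_mask: torch.Tensor) -> torch.Tensor:
    """Mean (1 - cosine similarity) between per-token encoder states.

    student_enc, teacher_enc: (batch, src_len, hidden)
    src_mask: (batch, src_len), 1.0 for real tokens, 0.0 for padding.
    """
    cos = F.cosine_similarity(student_enc, teacher_enc, dim=-1)
    return ((1.0 - cos) * src_mask).sum() / src_mask.sum()


def kd_loss(student_logits: torch.Tensor,
            distilled_targets: torch.Tensor,
            student_enc: torch.Tensor,
            teacher_enc: torch.Tensor,
            src_mask: torch.Tensor,
            pad_id: int,
            alpha: float = 0.5) -> torch.Tensor:
    """Sequence-level KD (cross-entropy against the teacher's decoded
    output, the 'distilled' target) plus the encoder-alignment term."""
    ce = F.cross_entropy(
        student_logits.reshape(-1, student_logits.size(-1)),
        distilled_targets.reshape(-1),
        ignore_index=pad_id,
    )
    # Teacher encoder states serve as fixed targets, so no gradient
    # flows back into the teacher.
    align = encoder_alignment_loss(student_enc, teacher_enc.detach(), src_mask)
    return ce + alpha * align
```

In a training loop, `distilled_targets` would come from decoding the training source side with the teacher (standard sequence-level KD), while the teacher's encoder states are computed in a `torch.no_grad()` forward pass over the same source sentences; `alpha` trades decoder-side distillation off against encoder alignment.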
Files
Original bundle (3 files)

| Name | Size | Format | Description |
| TH5868-1.pdf | 549.95 KB | Adobe Portable Document Format | Pre-text |
| TH5868-2.pdf | 103.35 KB | Adobe Portable Document Format | Post-text |
| TH5868.pdf | 962.96 KB | Adobe Portable Document Format | Full-thesis |
License bundle (1 file)

| Name | Size | Format |
| license.txt | 1.71 KB | Item-specific license agreed upon to submission |
