Multi-domain neural machine translation with knowledge distillation for low resource languages

Date

2025

Authors

Menan, V.

Abstract

Multi-domain adaptation in Neural Machine Translation (NMT) is crucial for ensuring high-quality translations across diverse domains. Traditional fine-tuning approaches, while effective, become impractical as the number of domains increases, leading to high computational costs, space complexity, and catastrophic forgetting. Knowledge Distillation (KD) offers a scalable alternative by training a compact model using distilled data from a larger teacher model. However, we hypothesize that sequence-level KD primarily distills the decoder while neglecting encoder knowledge transfer, resulting in suboptimal adaptation and generalization, particularly in low-resource language settings where both data and computational resources are constrained.

To address this, we propose an improved sequence-level distillation framework enhanced with encoder alignment using a cosine similarity-based loss. Our approach ensures that the student model captures both encoder and decoder knowledge, mitigating the limitations of conventional KD. We evaluate our method on multi-domain German–English translation under simulated low-resource conditions and further extend the evaluation to a bona fide low-resource language, demonstrating the method's robustness across diverse data conditions.

Results demonstrate that our proposed encoder-aligned student model can even outperform its larger teacher models, achieving strong generalization across domains. Additionally, our method enables efficient domain adaptation when fine-tuned on new domains, surpassing existing KD-based approaches. These findings establish encoder alignment as a crucial component for effective knowledge transfer in multi-domain NMT, with significant implications for scalable and resource-efficient domain adaptation.
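
The abstract attributes the improvement to aligning student and teacher encoder representations with a cosine similarity-based loss on top of sequence-level KD. The sketch below is a minimal, hypothetical PyTorch illustration of such a combined objective; the function names, tensor shapes, masking convention, and the weight alpha are assumptions for illustration, not the thesis's actual implementation.

```python
# Minimal sketch (assumed, not from the thesis): sequence-level KD loss on
# teacher-distilled targets plus a cosine-similarity encoder-alignment term.
import torch
import torch.nn.functional as F


def encoder_alignment_loss(student_hidden, teacher_hidden, pad_mask=None):
    """Mean (1 - cosine similarity) between student and teacher encoder states.

    student_hidden, teacher_hidden: (batch, seq_len, d_model) tensors.
    pad_mask: optional (batch, seq_len) bool tensor, True at padding positions.
    If the student uses a smaller d_model than the teacher, a learned linear
    projection (not shown) would first map student states to the teacher size.
    """
    cos = F.cosine_similarity(student_hidden, teacher_hidden, dim=-1)  # (batch, seq_len)
    loss = 1.0 - cos
    if pad_mask is not None:
        loss = loss.masked_fill(pad_mask, 0.0)
        return loss.sum() / (~pad_mask).sum().clamp(min=1)
    return loss.mean()


def distillation_objective(student_logits, distilled_targets,
                           student_hidden, teacher_hidden,
                           pad_mask=None, alpha=0.5):
    """Sequence-level KD cross-entropy (targets are teacher translations)
    combined with the encoder-alignment term; alpha is an assumed weight."""
    ce = F.cross_entropy(student_logits.transpose(1, 2),  # (batch, vocab, seq_len)
                         distilled_targets,               # (batch, seq_len)
                         ignore_index=-100)
    return ce + alpha * encoder_alignment_loss(student_hidden, teacher_hidden, pad_mask)


if __name__ == "__main__":
    # Toy shapes only, to show the call pattern.
    batch, src_len, tgt_len, d_model, vocab = 2, 7, 5, 512, 32000
    student_hidden = torch.randn(batch, src_len, d_model)
    teacher_hidden = torch.randn(batch, src_len, d_model)
    student_logits = torch.randn(batch, tgt_len, vocab)
    distilled_targets = torch.randint(0, vocab, (batch, tgt_len))
    print(distillation_objective(student_logits, distilled_targets,
                                 student_hidden, teacher_hidden))
```

Under these assumptions, the alignment term supplies the encoder-side supervision that plain sequence-level KD (which only trains the student on teacher-generated translations) does not provide.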

Citation

Menan, V. (2025). Multi-domain neural machine translation with knowledge distillation for low resource languages [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. https://dl.lib.uom.lk/handle/123/24433
