Neural machine translation approach for Singlish to English translation

dc.contributor.advisor: Fernando S
dc.contributor.advisor: Sumathipala S
dc.contributor.author: Sandaruwan HGD
dc.date.accept: 2021
dc.date.accessioned: 2021
dc.date.available: 2021
dc.date.issued: 2021
dc.description.abstract: This dissertation presents research that aimed to propose a language model for translating text written in Singlish into English. Singlish is an alternative writing system for the Sinhala language that uses Latin script (the English alphabet) instead of the native Sinhala alphabet. Such a translator has long been needed, since many Sri Lankans use this writing method for product reviews, social media posts, comments, and similar text. Many research students have attempted the task over the past couple of years, but the main challenge has been finding a suitable data set with which to evaluate deep learning models for this Natural Language Processing (NLP) task. Hence, traditional statistical and rule-based models have been proposed, using less data. This research addresses the challenge of preparing a data set to evaluate a deep learning approach for this machine translation task, and it also evaluates a seq2seq Neural Machine Translation (NMT) model. The proposed seq2seq model is based purely on the attention mechanism, which has been used to improve NMT by selectively focusing on parts of the source sentence during translation. The proposed approach achieves a BLEU score of 24.13 on Singlish-English after seeing ~0.15 M parallel sentence pairs with a ~50 K word vocabulary. [en_US]
dc.identifier.accno: TH5006 [en_US]
dc.identifier.citation: Sandaruwan, H.G.D. (2021). Neural machine translation approach for Singlish to English translation [Master's theses, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/21470
dc.identifier.degree: MSc in Artificial Intelligence [en_US]
dc.identifier.department: Department of Computational Mathematics [en_US]
dc.identifier.faculty: IT [en_US]
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/21470
dc.language.iso: en [en_US]
dc.subject: SINGLISH [en_US]
dc.subject: NMT [en_US]
dc.subject: LANGUAGE PROCESSING [en_US]
dc.subject: SEQ2SEQ [en_US]
dc.subject: ATTENTION MODEL [en_US]
dc.subject: WORD EMBEDDING [en_US]
dc.subject: INFORMATION TECHNOLOGY - Dissertation [en_US]
dc.subject: COMPUTATIONAL MATHEMATICS - Dissertation [en_US]
dc.subject: ARTIFICIAL INTELLIGENCE - Dissertation [en_US]
dc.title: Neural machine translation approach for Singlish to English translation [en_US]
dc.type: Thesis-Abstract [en_US]

Files

Original bundle

Name:
TH5006-1.pdf
Size:
200.7 KB
Format:
Adobe Portable Document Format
Description:
Pre-Text
Name:
TH5006-2.pdf
Size:
139.66 KB
Format:
Adobe Portable Document Format
Description:
Post-Text
Name:
TH5006.pdf
Size:
1000.04 KB
Format:
Adobe Portable Document Format
Description:
Full-theses

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission