Institutional-Repository, University of Moratuwa.  

A Hybrid approach to natural language machine translation for Sinhala & English

dc.contributor.advisor Karunananda, AS Gray, HP 2012-03-16T11:52:56Z 2012-03-16T11:52:56Z
dc.description.abstract Machine translation is one of the least mature areas of natural language processing. Natural languages are complex: a word can have several meanings, a sentence can have several translations, and the translation of a sentence may depend on its context. In this report we describe an approach to machine translation for the Sinhala and English languages. We postulate that humans translate natural languages using simple rules and accumulated experience, without explicit knowledge of sophisticated language constructs such as morphology, syntax, semantics and pragmatics. This hypothesis is inspired by the fact that humans construct word forms, phrases and sentences from newly learned words by applying simple rules, without being fully conscious of those rules. We do not ignore the fact that not all words in a vocabulary follow the same word-formation rules; humans use specific knowledge about certain words when they construct sentences. Word selection in a translated sentence also varies with the context or the semantics of the sentence. Because of this complexity, we focus on a hybrid approach that uses both rules and statistics. The system described in this thesis models the steps a human takes to translate a sentence from one language to the other. A bilingual dictionary models the knowledge of words and synonyms in both languages. Exceptional-word dictionaries represent the knowledge of special words that do not follow the common rules of morphology. Language parsers handle the syntax of sentences in either language. Morphological analyzers handle the rules used to construct word forms, while statistical analyzers handle proper word usage depending on the syntax. The system was evaluated by comparing human translations with the machine translation output. The two dominant factors considered were how understandable the translated sentence is and how much information it retains compared to the original. The results met the expected quality, and further work is required to improve the semantics of the translation.
dc.language.iso en_US en_US
dc.title A Hybrid approach to natural language machine translation for Sinhala & English
dc.identifier.faculty Engineering en_US MSc en_US
dc.identifier.department Department of Computer Science & Engineering en_US 2010
dc.identifier.accno 96407 en_US
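
The abstract describes a hybrid design in which a bilingual dictionary supplies candidate words, an exception dictionary overrides the general morphology rules, and statistics decide which candidate fits the context. The following is only a minimal sketch of that idea, not the system from the thesis. All names (`BILINGUAL`, `EXCEPTIONS`, `BIGRAMS`, `inflect`, `choose`, `translate_token`), the placeholder source keys, the pluralization rule and the bigram counts are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical toy data; none of it is taken from the thesis.
# Bilingual dictionary: placeholder source-word key -> candidate English lemmas.
BILINGUAL = {
    "src_child": ["child", "kid"],
    "src_big": ["big", "large"],
    "src_box": ["box"],
}

# Exception dictionary: word forms that do not follow the general rule.
EXCEPTIONS = {("child", "plural"): "children"}

# Bigram counts standing in for the statistical analyzer (assumed values).
BIGRAMS = Counter({
    ("the", "children"): 5,
    ("the", "kids"): 2,
    ("at", "large"): 3,
})


def inflect(lemma, feature):
    """Build a word form: consult the exceptions first, then a simple rule."""
    if (lemma, feature) in EXCEPTIONS:
        return EXCEPTIONS[(lemma, feature)]
    if feature == "plural":
        if lemma.endswith(("s", "x", "ch", "sh")):
            return lemma + "es"
        return lemma + "s"
    return lemma  # no feature requested: keep the lemma unchanged


def choose(candidates, previous):
    """Pick the candidate seen most often after the previous word.

    Ties (including all-zero counts) fall back to dictionary order.
    """
    return max(candidates, key=lambda word: BIGRAMS[(previous, word)])


def translate_token(source_key, feature, previous):
    """Dictionary lookup, then morphology, then statistical selection."""
    forms = [inflect(lemma, feature) for lemma in BILINGUAL[source_key]]
    return choose(forms, previous)


if __name__ == "__main__":
    print(translate_token("src_child", "plural", "the"))
    print(translate_token("src_big", None, "at"))
```

Checking the exception dictionary before the general rule mirrors the abstract's point that some words need word-specific knowledge. A real system would also need the parsing and reordering steps that this sketch leaves out.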
