Using multi agent technology for automatic machine translation

Hettige B

dc.contributor.advisor	Karunananda AS
dc.contributor.advisor	Rzevski G
dc.contributor.author	Hettige B
dc.date.accessioned	2020
dc.date.available	2020
dc.date.issued	2020
dc.identifier.citation	Hettige, B. (2020). Using multi agent technology for automatic machine translation [Doctoral dissertation, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/16916.
dc.identifier.uri	http://dl.lib.uom.lk/handle/123/16916
dc.description.abstract	Machine translation is a cost-effective, quick, and widely accepted method of automated language translation that has become essential in an increasingly globalized world. It can follow one or more approaches, including dictionary-based, rule-based, example-based, phrase-based, statistical, and neural-linguistic approaches. Nevertheless, most existing machine translation systems show a quality gap when compared with human translation, so human translation is still considered the best method of language translation. Human language translation is a complex and opportunistic process that depends on human memory, and it has been described through a few theories. Among them, the garden path model and the constraint satisfaction model are two fundamental accounts of human language translation, especially concerning sentence parsing with meaning. These two theoretical models demonstrate how to select suitable words in the phrase of a sentence to generate accepted meanings. Based on these two theories, a hybrid approach to machine translation is proposed. The approach is inspired by how people parse and translate a sentence by putting available phrases together with accepted meaning. Translation is done in three stages. In the first stage, the system analyses the given sentence by considering the morphology, syntax, and semantics of the source language. In the second stage, the system uses phrase-based translation to translate each phrase into the target language, producing multiple candidate solutions. The phrase translation considers four factors from psycholinguistic parsing: phrase structure, semantic features, thematic roles, and probability.
Finally, considering all the translated phrases, the system should identify suitable target-language phrases that yield accepted meanings, taking subject-verb and object-verb agreement into account. After subject-verb-object agreement is reached, the remaining phrases in the sentence should be re-arranged according to the accepted subject, object, and verb phrases. This approach has been simulated with a multi-agent system named EnSiMaS, which translates English text into Sinhala. EnSiMaS was implemented on the MaSMT framework, which was specially developed for agent-based machine translation. EnSiMaS comprises 26 language-processing agents covering both the source and target languages. These agents were clustered into six agent swarms according to the morphological, syntactic, and semantic concerns of the source and target languages. In addition to these language-processing agents, the system should be able to create an agent dynamically for each source-language phrase. These dynamically created phrase agents should be capable of communicating with other relevant phrase agents and settling on the accepted solutions. EnSiMaS was tested with 85 sample English sentences. For each English sentence, three different translations were taken. According to the evaluation result, the system shows an 8.77% word error rate, a 6.72% inflexion error rate, and a 5.37% sentence error rate for the first, second, and third translations. In addition, the calculated BLEU scores are 0.89160756, 0.52009204, and 0.43581893 for the first, second, and third translations. Then 25 randomly selected sample sentences were used to calculate the adequacy and fluency of EnSiMaS. Adequacy and fluency ratings were collected from 55 human evaluators, who compared the output against human-translated reference sentences.
Kendall’s tau correlation coefficient shows a weak positive association between the adequacy levels of the human translations and the EnSiMaS translations, and a moderate positive association between their fluency levels. Further, according to the Fleiss’ kappa coefficient, there is significant fair agreement among raters for the adequacy and fluency ratings.	en_US
dc.language.iso	en	en_US
dc.subject	COMPUTATIONAL MATHEMATICS-Dissertations	en_US
dc.subject	MACHINE TRANSLATION	en_US
dc.subject	MULTI-AGENT SYSTEMS	en_US
dc.subject	HUMAN LANGUAGE PROCESSING	en_US
dc.subject	MULTI-AGENT SYSTEM FOR MACHINE TRANSLATION	en_US
dc.subject	ENGLISH TO SINHALA MULTI AGENT SYSTEM	en_US
dc.subject	MaSMT	en_US
dc.subject	EnSiMaS	en_US
dc.title	Using multi agent technology for automatic machine translation	en_US
dc.type	Thesis-Full-text	en_US
dc.identifier.faculty	IT	en_US
dc.identifier.degree	Doctor of Philosophy	en_US
dc.identifier.department	Department of Computational Mathematics	en_US
dc.date.accept	2020
dc.identifier.accno	TH4442	en_US
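
The abstract reports word error rate and sentence error rate figures but does not say how they were computed. As a rough illustration only, the sketch below assumes the common definitions: word error rate as word-level edit distance divided by the number of reference words, and sentence error rate as the share of sentences that differ from their reference. The function names and whitespace tokenization are assumptions made for this example and are not taken from the thesis.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing (deletion)
                            curr[j - 1] + 1,      # extra hypothesis word (insertion)
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Total word edits divided by total reference words (assumed definition)."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens, hyp_tokens = ref.split(), hyp.split()
        total_edits += edit_distance(ref_tokens, hyp_tokens)
        total_words += len(ref_tokens)
    return total_edits / total_words


def sentence_error_rate(references, hypotheses):
    """Share of hypotheses that are not identical to their reference."""
    wrong = sum(1 for ref, hyp in zip(references, hypotheses)
                if ref.split() != hyp.split())
    return wrong / len(references)
```

Under these assumptions, calling `word_error_rate(refs, hyps)` on parallel lists of reference and system sentences returns a fraction that can be multiplied by 100 to compare with percentages such as those quoted in the abstract.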