
dc.contributor.advisor Dias, G
dc.contributor.author Fernando, SC
dc.date.accessioned 2011-03-30T05:09:38Z
dc.date.available 2011-03-30T05:09:38Z
dc.identifier.citation Fernando, S.C. (2007). Inexact matching of proper names in Sinhala [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.mrt.ac.lk/handle/123/666
dc.identifier.uri http://dl.lib.mrt.ac.lk/handle/123/666
dc.description CD-ROM included; a dissertation submitted to the Department of Computer Science and Engineering for the MSc in Computer Science, specializing in Software Architecture en_US
dc.description.abstract With the advancement of technology, maintaining national data and information becomes important. Most of this data has to be maintained in the local languages because the majority of Sri Lankans are still not very conversant in English. Therefore, when public organizations embrace IT, their data, including personal data, has to be maintained in local languages. When data and information are available in the local language, searching and retrieving them using the local language becomes essential.
Proper nouns have an inherent problem: a given proper noun, for example a name, can be spelt in several different ways. This problem becomes more prominent when a name originating in one language is spelt using another language. For example, the Sinhala name S®dg)d can be spelt in several ways such as Se&azsto, B&odo or Sg»26>3 using Sinhala itself. Therefore, someone who searches an information store for a proper name may not find a match if the spelling used in the search differs from the spelling that was stored.
This research aimed to provide a solution to this problem for the Sinhala language: a rule-based search application that takes a Sinhala input string, searches an information store and retrieves matching results even if they were stored with a different spelling. This was achieved by building a rule base that replaces characters of a keyword with different characters in order to generate a set of words with different spellings. This set of words is then searched for in the information store and the results are displayed. Rules were organized in different levels so that the user can select the level of character replacement, retrieving either matches with a slight spelling difference or matches with drastic spelling differences.
A special rule set was built for matching Tamil names written in Sinhala. The user has the option to enable or disable this rule set independently.
An application that uses a general-purpose rule engine to process the rules was designed and implemented to demonstrate this technology. The application consists of a web-based user interface and a sample database as the information store. It was designed in a layered architecture so that future expansion and component reuse are possible. All character replacement rules are declared in text files, so changes and updates to the rule base can be made without modifying the system.
It is shown that the application, with the rule base that was built, provides a solution to the proper name search problem stated above. This system can be integrated with future information systems in government and business organisations.
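The approach described in the abstract (character replacement rules declared as text, grouped into levels, used to expand a keyword into spelling variants before searching) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the dissertation's implementation: the rule-file syntax (`level: character = alternative`), the sample rules (commonly confused Sinhala letter pairs), and the names `parse_rules`, `variants` and `search` are hypothetical, and a plain dictionary stands in for the general-purpose rule engine.

```python
from itertools import product

# Hypothetical rule text; in practice this would be read from a text file.
# Format assumed for this sketch: "level: character = alternative".
RULES_TEXT = """
# level : character = alternative spelling
1: න = ණ
1: ල = ළ
2: ශ = ෂ
2: ෂ = ස
"""


def parse_rules(text):
    """Parse 'level: a = b' lines into {char: [(level, alternative), ...]}."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        level, pair = line.split(":", 1)
        left, right = (part.strip() for part in pair.split("="))
        # Assumption: rules are symmetric, since either spelling may be stored.
        rules.setdefault(left, []).append((int(level), right))
        rules.setdefault(right, []).append((int(level), left))
    return rules


def variants(word, rules, max_level):
    """Return the spellings obtained by one pass of rules up to max_level."""
    choices = []
    for ch in word:
        alts = [alt for lvl, alt in rules.get(ch, []) if lvl <= max_level]
        choices.append([ch] + alts)  # keep the original character as an option
    return {"".join(combo) for combo in product(*choices)}


def search(keyword, store, rules, max_level):
    """Return stored names that equal any generated variant of the keyword."""
    candidates = variants(keyword, rules, max_level)
    return [name for name in store if name in candidates]
```

With these sample rules, a keyword containing both න and ල expands to four candidate spellings at level 1, while level 0 returns only the keyword itself. Replacement is applied in a single pass per character, so rules are not chained; a real rule base would also need to bound the number of generated variants for long names.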
dc.format.extent ix, 64p. : ill. en_US
dc.language.iso en en_US
dc.subject COMPUTER SCIENCE - Dissertation
dc.subject COMPUTER SCIENCE AND ENGINEERING - Dissertation
dc.subject INFORMATION TECHNOLOGY - Sri Lanka
dc.subject INFORMATION RETRIEVAL - Local Languages
dc.subject INFORMATION SYSTEMS - Local Languages
dc.subject INFORMATION SYSTEMS - Data Processing
dc.subject SINHALESE LANGUAGE
dc.title Inexact matching of proper names in Sinhala
dc.type Thesis-Abstract
dc.identifier.faculty Engineering en_US
dc.identifier.degree MSc en_US
dc.identifier.department Department of Computer Science & Engineering en_US
dc.date.accept 2006-12
dc.identifier.accno 92293 en_US

