Using neural networks recognition of handwritten mathematical document

dc.contributor.authorLiyanage, NN
dc.date.accept2008
dc.date.accessioned2011-07-20T06:05:18Z
dc.date.available2011-07-20T06:05:18Z
dc.date.issued7/20/2011
dc.description.abstractAdvancements in modern technologies still cannot override the importance of preparing handwritten documents. In particular, handwritten documents are unavoidable in mathematical calculations, mathematical tutorials, preparation of marking schemes and financial reports. Although handheld computing devices such as PDAs provide facilities for handwritten data entry, they are unable to process pre-constructed handwritten mathematical documents. This project presents an approach to the design and implementation of an Artificial Neural Network (ANN) solution for recognizing previously written handwritten mathematical documents and producing the output as a text file. The system consists of three modules for image processing, character recognition and text formation. The image processing module has been designed to capture the features of handwritten characters and to produce quality inputs for the ANN module. This module accepts scanned mathematical documents in digitized form and scales them down to 48x48 pixels while ensuring that image quality is not affected. Having received the digitized documents, the module carries out thresholding, normalization, segmentation and feature extraction of the numeric characters and mathematical symbols in the document. The entire image processing module has been developed in Visual Basic 6.0, which provides a standard toolkit for image processing applications. The ANN module has been designed with a three-layer architecture trained with the backpropagation algorithm. The architecture comprises an input layer, a single hidden layer and an output layer. To match the 48x48 pixel input, the input layer has 2304 (48x48) neurons. The output layer has 22 neurons, corresponding to 10 numeric characters and 12 symbols.
Eight hundred images have been used for training and two hundred for testing. The maximum error tolerated by the ANN has been set to 0.00015. This module has been designed and developed using NeuroSolution to identify the 10 digits and mathematical symbols such as +, -, /, x, =, <, >, =:; ~ ~, ( and ). Among ANN development environments, NeuroSolution provides not only a standard toolkit for training and testing ANNs, but also a built-in facility for handling image recognition applications. In addition, NeuroSolution provides a DLL for linking with Visual Basic and C++, thereby allowing the development of integrated applications. The output generation module and the system integration have been developed using Visual Basic 6.0. The output generator module has been designed to write the output recognized by the trained network into a text file. It also highlights numeric characters or mathematical symbols that were identified with some ambiguity. Such ambiguities can be resolved by a suitable person post-editing the generated output document. The prototypes developed in this project have been trained to identify the digits 0, 1, 2, 3 and the symbols + and = using 800 images. Testing with 200 images recorded 95% accuracy in recognition by the trained ANN. The network can be extended to other mathematical symbols by further training, and the accuracy can be improved by introducing more input images. The system can be run on a standard PC. Keywords- Image processing, Artificial Neural Networks, Handwritten character recognition, Backpropagation, Supervised Learningen_US
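
The pipeline described in the abstract (thresholding a 48x48 image into 2304 inputs, a three-layer network with 22 outputs trained by backpropagation to an error of 0.00015, and flagging of ambiguous results) can be illustrated with a minimal Python sketch. This is not the author's implementation, which was built with NeuroSolution and Visual Basic 6.0. The hidden layer size, learning rate, pixel cutoff, confidence threshold and the placeholder symbol labels are all assumptions, since the abstract does not state them.

```python
import math
import random

SIDE = 48                  # image side length, from the abstract
N_INPUT = SIDE * SIDE      # 2304 input neurons
N_OUTPUT = 22              # 10 digits + 12 symbols
N_HIDDEN = 16              # assumption: hidden size is not given in the abstract
TARGET_ERROR = 0.00015     # stopping error quoted in the abstract

# Assumption: placeholder names stand in for the 12 symbols.
LABELS = [str(d) for d in range(10)] + [f"sym{i}" for i in range(12)]


def threshold(image, cutoff=128):
    """Flatten a 48x48 grayscale image (rows of 0-255 ints) to 0/1 values.

    Dark pixels (ink) become 1.0. The cutoff of 128 is an assumption.
    """
    return [1.0 if px < cutoff else 0.0 for row in image for px in row]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights; the last entry of each row is the bias."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def layer_forward(weights, inputs):
    # zip stops after len(inputs) weights, so the bias is added separately.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]


def train_step(w_hidden, w_out, x, target, lr=0.5):
    """One backpropagation update; returns the mean squared error
    measured before the update. lr=0.5 is an assumption."""
    h = layer_forward(w_hidden, x)
    y = layer_forward(w_out, h)
    # Error terms for the output layer (squared error, sigmoid units).
    d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, target)]
    # Propagate back to the hidden layer using the pre-update weights.
    d_hid = [hj * (1 - hj) * sum(d_out[k] * w_out[k][j]
                                 for k in range(len(w_out)))
             for j, hj in enumerate(h)]
    for k, row in enumerate(w_out):
        for j, hj in enumerate(h):
            row[j] -= lr * d_out[k] * hj
        row[-1] -= lr * d_out[k]
    for j, row in enumerate(w_hidden):
        for i, xi in enumerate(x):
            row[i] -= lr * d_hid[j] * xi
        row[-1] -= lr * d_hid[j]
    return sum((yk - tk) ** 2 for yk, tk in zip(y, target)) / len(y)


def classify(w_hidden, w_out, x, min_conf=0.8):
    """Return (label, ambiguous). A result is flagged as ambiguous when
    the winning activation is below min_conf (an assumed criterion)."""
    y = layer_forward(w_out, layer_forward(w_hidden, x))
    best = max(range(len(y)), key=y.__getitem__)
    return LABELS[best], y[best] < min_conf
```

In a full training loop, `train_step` would be repeated over all training images until the average error falls below `TARGET_ERROR`, and characters that `classify` flags as ambiguous would be highlighted in the output text file for post-editing.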
dc.identifier.accno92991en_US
dc.identifier.degreeMScen_US
dc.identifier.departmentFaculty of Information Technologyen_US
dc.identifier.facultyITen_US
dc.identifier.urihttp://dl.lib.mrt.ac.lk/theses/handle/123/1771
dc.language.isoenen_US
dc.subjectINFORMATION TECHNOLOGY-Dissertation ; COMPUTER AND INFORMATION SCIENCE-Dissertation ; IMAGE PROCESSING ; IMAGE PROCESSING-DIGITAL TECHNIQUES ; IMAGE PROCESSING, COMPUTER VISION, PATTERN RECOGNITION, AND GRAPHICS ; ARTIFICIAL NEURAL NETWORKS ; CHARACTER RECOGNITION DEVICES, OPTICAL ; BACKPROPAGATION (ARTIFICIAL INTELLIGENCE) ; SUPERVISED LEARNING (MACHINE LEARNING) ; MACHINE LEARNING ; MACHINE LEARNING-TECHNIQUEen_US
dc.titleUsing neural networks recognition of handwritten mathematical documenten_US
dc.typeThesis-Abstract

Files

Original bundle

Name: 92991.pdf
Size: 160.39 KB
Format: Adobe Portable Document Format

Name: 092991.pdf
Size: 12.48 MB
Format: Adobe Portable Document Format
Description: full-text
