Investigating the learning progress of CNNs in script identification using gradient values

dc.contributor.author: Tomioka, E
dc.contributor.author: Morita, K
dc.contributor.author: Shirai, NC
dc.contributor.author: Wakabayashi, T
dc.contributor.author: Ohyama, W
dc.contributor.editor: Sudantha, BH
dc.date.accessioned: 2022-11-18T04:26:52Z
dc.date.available: 2022-11-18T04:26:52Z
dc.date.issued: 2019-12
dc.description.abstract: Demand for automatic translation based on Camera-based Multilingual Optical Character Recognition (CM-OCR) is increasing. CM-OCR methods usually employ a script identification step before character recognition. Recent approaches to script identification rely on Convolutional Neural Networks (CNNs) because of their promising performance in image recognition tasks. However, researchers have stressed the importance of understanding the decision criteria of CNNs and have warned against employing them as black-box classifiers in practical tasks. The purpose of this research is therefore to investigate the hyperparameter dependence of CNNs and to visualize the regions that CNNs focus on in script identification. We applied Grad-CAM to script identification, treated as an image classification task, using the SIW-13 dataset. We investigated the learning progress of CNNs by defining the value used in Grad-CAM as the "reaction" and visualized the regions that CNNs focus on. The results show that learning was stable when the number of hyperparameters was sufficient for the given training samples, even though the number of hyperparameters to be tuned increased. This demonstrates that the capacity to learn the training samples stably depends on the number of hyperparameters. When the capacity was insufficient, learning became unstable, which produced scripts with relatively low accuracy. Analyzing one of the low-accuracy scripts with Grad-CAM, we found that the learning progress of some failure cases changes greatly with the hyperparameters of the CNN. Scatter plots of the reaction against the probability clarified the capacity of the CNNs for each script. [en_US]
dc.identifier.citation: E. Tomioka, K. Morita, N. C. Shirai, T. Wakabayashi and W. Ohyama, "Investigating the Learning Progress of CNNs in Script Identification Using Gradient Values," 2019 4th International Conference on Information Technology Research (ICITR), 2019, pp. 1-6, doi: 10.1109/ICITR49409.2019.9407784. [en_US]
dc.identifier.conference: 4th International Conference in Information Technology Research 2019 [en_US]
dc.identifier.department: Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa. [en_US]
dc.identifier.doi: 10.1109/ICITR49409.2019.9407784 [en_US]
dc.identifier.faculty: IT [en_US]
dc.identifier.place: Colombo, Sri Lanka [en_US]
dc.identifier.proceeding: Proceedings of the 4th International Conference in Information Technology Research 2019 [en_US]
dc.identifier.uri: http://dl.lib.uom.lk/handle/123/19562
dc.identifier.year: 2019 [en_US]
dc.language.iso: en [en_US]
dc.publisher: Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa, Sri Lanka [en_US]
dc.relation.uri: https://ieeexplore.ieee.org/document/9407784 [en_US]
dc.subject: CM-OCR [en_US]
dc.subject: Deep Learning [en_US]
dc.subject: Convolutional Neural Network [en_US]
dc.subject: Grad-CAM [en_US]
dc.subject: Scene Text Script [en_US]
dc.subject: Script Identification [en_US]
dc.title: Investigating the learning progress of CNNs in script identification using gradient values [en_US]
dc.type: Conference-Full-text [en_US]
