Institutional-Repository, University of Moratuwa.  

Efficient depiction of video for semantic retrieval applications by dimensionality reduction of visual feature space

dc.contributor.advisor Ranathunga L
dc.contributor.advisor Abdullah N A
dc.contributor.author Bandara AMRR
dc.date.accessioned 2021
dc.date.available 2021
dc.date.issued 2021
dc.identifier.citation Bandara, A.M.R.R. (2021). Efficient depiction of video for semantic retrieval applications by dimensionality reduction of visual feature space [Doctoral dissertation, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/21175
dc.identifier.uri http://dl.lib.uom.lk/handle/123/21175
dc.description.abstract The retrieval of temporal digital visual data, by either a text or a visual query, requires automatic interpretation. This includes high-level annotation by object detection and recognition for text query-based retrieval, and low-level abstraction for visual query-based retrieval. Both the accuracy and the speed of the interpretation are crucial in real-world applications because of the high density of visual data. This study focused on efficiently reducing the complexity of visual data through dimensionality reduction techniques, for the detection and recognition of objects in videos, supporting both textual annotation and visual query-based video frame retrieval. The contributions of the study comprise three approaches: a novel visual feature descriptor based on colour dithering, namely the Salient Dither Pattern Feature (SDPF); a novel object segmentation method based on the proposed feature descriptor, namely Refining Superpixel and Histogram of oriented optical flow Clustering (RSHC); and a novel self-supervised local descriptor, namely Network-in-Network with Restricted Boltzmann Machine (NIN-RBM). The experimental results show that SDPF is rotation and scale invariant and computationally efficient, yet achieves object recognition accuracy similar to that of state-of-the-art methods with minimum supervision. The results further show that RSHC successfully uses SDPF to accurately segment individual objects from a very shallow history of motion. Furthermore, according to the results, NIN-RBM shows state-of-the-art correspondence matching performance over existing deep-learned self-supervised binary descriptors while keeping computation time to a minimum. The overall results support the conclusions that RSHC is capable of accurately segmenting objects in a video, and that SDPF can then be successfully used for recognizing the segmented objects.
Moreover, NIN-RBM can be used to reliably and rapidly retrieve video frames related to any visual query. Since NIN-RBM is a local descriptor, it can further be used for locating high-level objects and precisely estimating their poses, improving the detail of the semantics retrieved from video data. en_US
dc.language.iso en en_US
dc.subject DIMENSIONALITY REDUCTION en_US
dc.subject BINARY DESCRIPTOR en_US
dc.subject CORRESPONDENCE MATCHING en_US
dc.subject OBJECT RECOGNITION en_US
dc.subject VIDEO SEGMENTATION en_US
dc.subject COLOUR DITHERING en_US
dc.subject DEEP LEARNING en_US
dc.subject INFORMATION TECHNOLOGY -Dissertation en_US
dc.subject COMPUTER SCIENCE -Dissertation en_US
dc.title Efficient depiction of video for semantic retrieval applications by dimensionality reduction of visual feature space en_US
dc.type Thesis-Abstract en_US
dc.identifier.faculty IT en_US
dc.identifier.degree Doctor of Philosophy en_US
dc.identifier.department Department of Information Technology en_US
dc.date.accept 2021
dc.identifier.accno TH5063 en_US

