CONTENT-BASED IMAGE RETRIEVAL USING LARGE CENTRE REGIONS

Submitted by Rajinda Suresh Senaratne

A Thesis submitted to the Department of Electronic and Telecommunication Engineering at the University of Moratuwa in partial fulfilment of the requirements for the Degree of Master of Engineering.

February 2004

Examining Committee
Dr. G.D.S.P. Wimalaratne (Chairperson)
Dr. A.A. Pasqual
Dr. R.M.A.P. Rajatheva

This Research Project was carried out at the Department of Electronic and Telecommunication Engineering of the University of Moratuwa during the period from August 2002 to December 2003.

DECLARATION

The work presented in this dissertation has not been submitted for the fulfilment of any other degree.

Rajinda Suresh Senaratne (Candidate)
Dr. A.A. Pasqual (Supervisor)

ACKNOWLEDGEMENTS

I take this opportunity to convey my deep and sincere thanks to those who gave me tremendous assistance and co-operation to complete the Research Project and Thesis successfully.

First I would like to express my deep and sincere thanks to my Project Supervisor, Dr. Ajith A. Pasqual, for providing continuous guidance, advice, constructive suggestions and invaluable support throughout the project. He allowed me to choose this topic freely without setting any restrictions and consistently helped me and directed me in organizing and carrying out the work. I wish to express my gratitude and appreciation towards him for spending his valuable time in assisting me to make this project a success.

I would also like to thank the Chairman of my examining committee, Dr. Prasad Wimalaratne, for his kind advice and invaluable suggestions. I am grateful to Dr. Dileeka Dias for advising me and providing me the opportunity to enrol for this course. My gratitude goes to Dr. R. P.
Thilakumara, the Course Co-ordinator of PG Dip/MEng, for his kind advice and assistance. I thank all the academic and non-academic staff members for their assistance and support.

Finally I would like to thank my parents, friends and colleagues for their encouragement, support and co-operation, and all others who helped me to make this project a success.

Rajinda Senaratne
University of Moratuwa
February 2004

ABSTRACT

Among all the visual features used for content-based image retrieval, colour is perhaps the most dominant and distinguishing one in many applications. This research project therefore focuses on the colour property of images. In this work, a new histogram refinement technique, Large Centre Regions (LCR) Refinement, and a new region representation technique, LCR Sets, both based on colour regions, are presented. These methods extract a selected number of the largest regions around the centre of the image and match other images with emphasis on this property.

Two assumptions are made. The first is that the significant objects or items of an image are often located at the centre. These objects can often be characterized by their colour. Hence an image retrieval technique which extracts the colours of large centre regions of an image would improve the retrieval performance for images with significant objects at the centre. The second is that, although the techniques were tested on an image database consisting predominantly of red images, they perform similarly for other colours as well.

The presented histogram refinement descriptor, the Large-Centre-Regions Vector, effectively represents the large centre regions of an image. In addition, LCR Sets represent basic information about the shape of a region. In the prototype, all the regions in an image were first extracted according to the similarity of the colour of the pixels.
A centre zone was defined on the image, and a selected number of the largest regions that overlap this centre zone by at least 50% of the region area were selected as the Large-Centre-Regions, which form the basis for histogram refinement. In addition to large centre regions, LCR Sets represent the areas of a selected number of the largest regions lying outside the centre zone and the width-to-height ratio of the minimum bounding rectangle of each region. Since the largest regions at the centre are emphasized in matching, the effect of the background is also reduced, because most of the background often lies outside the centre zone. LCR Sets provide additional capability to distinguish among different images.

Experimental results of LCR Refinement show much improved retrieval performance, especially for images with significant regions at the centre. Results show a 20% average improvement in ranks with LCR Refinement compared to Histogram. By combining LCR Sets with either Histogram or LCR Refinement, this can be further improved up to 26% or 22%, respectively.
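The region selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the prototype's implementation: the function names, the use of exact equality of quantized colour indices as the similarity test, 4-connectivity, the centre zone covering the middle half of each dimension, and the default of three regions are all assumptions made here for illustration. Only the 50% overlap rule and the width-to-height ratio of the minimum bounding rectangle come from the text.

```python
from collections import deque


def extract_regions(img):
    """Group pixels into 4-connected regions of identical colour index.

    img is a list of rows of integer colour-bin indices (assumed to be
    already quantized). Exact equality stands in for colour similarity.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            colour, pixels = img[y][x], []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and img[ny][nx] == colour):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append({"colour": colour, "pixels": pixels})
    return regions


def describe(region, zone):
    """Return area, fraction of area inside the centre zone, and WHR."""
    top, left, bottom, right = zone  # bottom and right are exclusive
    ys = [p[0] for p in region["pixels"]]
    xs = [p[1] for p in region["pixels"]]
    area = len(region["pixels"])
    inside = sum(1 for y, x in region["pixels"]
                 if top <= y < bottom and left <= x < right)
    width = max(xs) - min(xs) + 1    # minimum bounding rectangle width
    height = max(ys) - min(ys) + 1   # minimum bounding rectangle height
    return {"colour": region["colour"], "area": area,
            "overlap": inside / area, "whr": width / height}


def large_centre_regions(img, k=3, min_overlap=0.5, zone_fraction=0.5):
    """Pick the k largest regions with at least min_overlap of their area
    inside a centred zone (zone size is a hypothetical choice)."""
    h, w = len(img), len(img[0])
    zh, zw = int(h * zone_fraction), int(w * zone_fraction)
    top, left = (h - zh) // 2, (w - zw) // 2
    zone = (top, left, top + zh, left + zw)
    described = [describe(r, zone) for r in extract_regions(img)]
    centre = [d for d in described if d["overlap"] >= min_overlap]
    centre.sort(key=lambda d: d["area"], reverse=True)
    return centre[:k]


# A 6x6 toy image: background colour 0 with a 2x4 blob of colour 1.
demo = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
lcr = large_centre_regions(demo)
```

In this toy image the blob keeps three quarters of its area inside the assumed centre zone and is selected, while the background, although larger, lies mostly outside the zone and is rejected. This mirrors the argument that emphasizing centre regions reduces the effect of the background.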
TABLE OF CONTENTS

Title Page
Declaration
Acknowledgements
Abstract
Table of Contents
List of Figures
List of Tables

1.0 INTRODUCTION
  1.1 Motivation
  1.2 Deficiencies of Existing Methods
  1.3 Contributions
  1.4 The Structure of the Thesis
2.0 BACKGROUND AND RELATED WORK
  2.1 Introduction to CBIR
    2.1.1 Feature Extraction
    2.1.2 Feature Integration and Indexing
    2.1.3 Interactive CBIR
  2.2 Colour Histogram
    2.2.1 Similarity Measurements
    2.2.2 The Capacity of Colour Histogram Indexing
    2.2.3 Illumination Invariant Descriptors
  2.3 Colour Moments
  2.4 Partially Overlapping Fuzzy Regions
  2.5 Colour Sets
  2.6 Histogram Refinement
    2.6.1 Colour Coherence Vector
    2.6.2 Centering Refinement
    2.6.3 Successive Refinement
  2.7 Colour Correlograms
  2.8 Summary
3.0 LARGE CENTRE REGIONS REFINEMENT
  3.1 Image Database
  3.2 Conversion to HSV Space
  3.3 Quantization
  3.4 Region Extraction
  3.5 Classification of Large Centre Regions
  3.6 Comparison of LCR-vectors
  3.7 Performance Analysis
  3.8 Summary
4.0 LARGE CENTRE REGIONS SETS
  4.1 Classification of Regions
  4.2 Comparison of LCR Sets
    4.2.1 Matching by Area Only
    4.2.2 Matching by WHR Only
    4.2.3 Matching by Combination of Both Area and WHR
  4.3 Combining LCR Sets with Histogram
    4.3.1 Matching by Area Only
    4.3.2 Matching by WHR Only
    4.3.3 Matching by Combination of Both Area and WHR
  4.4 Combining LCR Sets with LCR Refinement
    4.4.1 Matching by Area Only
    4.4.2 Matching by WHR Only
    4.4.3 Matching by Combination of Both Area and WHR
  4.5 Summary
5.0 OTHER DESCRIPTORS USED FOR COMPARISON
  5.1 Colour Histogram
  5.2 Colour Coherence Vector
  5.3 Histogram Centering Refinement
  5.4 Successive Refinement
  5.5 Summary
6.0 EXPERIMENTAL RESULTS OF LCR REFINEMENT
  6.1 Maximum Precision for 100% Recall
  6.2 Recall vs. Precision Plots
  6.3 Comparison by Rank
  6.4 Analysis of Results
  6.5 Summary
7.0 EXPERIMENTAL RESULTS OF LCR SETS
  7.1 Comparison by Rank
  7.2 Recall vs. Precision Plots
  7.3 Analysis of Results
  7.4 Summary
8.0 CONCLUSIONS
  8.1 Conclusions
  8.2 Recommendations and Future Work
Bibliography
Appendix A
Appendix B
Appendix C

LIST OF FIGURES

2.1 An Image Retrieval System Architecture
2.2 Fuzzy Regions
3.1 Transformation from RGB to HSV space
3.2 Centre Zone of an Image
6.1 Maximum Precision vs. Query Image No. for 100% recall
6.2 Images Retrieved by LCR-Refinement with Weight Ratio 70:25:5
6.3 Recall vs. Precision for query image no. 80
6.4 Recall vs. Precision for query image no. 99
6.5 Recall vs. Precision for query image no. 285
6.6 Recall vs. Precision for query image no. 290
6.7 Recall vs. Precision for query image no. 344
6.8 Recall vs. Precision for query image no. 1
6.9 Recall vs. Precision for query image no. 27
6.10 Recall vs. Precision for query image no. 36
6.11 Recall vs. Precision for query image no. 258
6.12 Recall vs. Precision for query image no. 263
6.13 Recall vs. Precision for query image no. 280
6.14 Recall vs. Precision for query image no. 309
6.15 Recall vs. Precision for query image no. 334
6.16 Pixels Contributing to Descriptors for Images 80, 99, 285, 290 & 344
6.17 Portions of Background Contributing to Descriptors
6.18 Images that can be avoided by LCR Refinement
6.19 An Irrelevant Image for query image no. 290
7.1 Recall vs. Precision for query image no. 80
7.2 Recall vs. Precision for query image no. 99
7.3 Recall vs. Precision for query image no. 285
7.4 Recall vs. Precision for query image no. 290
7.5 Recall vs. Precision for query image no. 344
7.6 Images 99, 108 & 304 and their Large Centre Regions
B.1 Similar Image Sets (Image number is shown on top.)
C.1 GUI for Entering Queries
C.2 GUI for Entering Weights and Thresholds

LIST OF TABLES

6.1 Ranks of Relevant Images of Query Image no. 80
6.2 Ranks of Relevant Images of Query Image no. 99
6.3 Ranks of Relevant Images of Query Image no. 285
6.4 Ranks of Relevant Images of Query Image no. 290
6.5 Ranks of Relevant Images of Query Image no. 344
6.6 Maximum Ranks of Relevant Images of Image Sets 1-5
6.7 Maximum Ranks of Relevant Images of Image Sets 6-13
7.1 Ranks of Relevant Images of Query Image no. 80
7.2 Ranks of Relevant Images of Query Image no. 99
7.3 Ranks of Relevant Images of Query Image no. 285
7.4 Ranks of Relevant Images of Query Image no. 290
7.5 Ranks of Relevant Images of Query Image no. 344
7.6 Ranks of Relevant Images of Query Image no. 80
7.7 Ranks of Relevant Images of Query Image no. 99
7.8 Ranks of Relevant Images of Query Image no. 285
7.9 Ranks of Relevant Images of Query Image no. 290
7.10 Ranks of Relevant Images of Query Image no. 344
7.11 Maximum Ranks of Relevant Images