FACE RECOGNITION USING KERNEL CLASSIFIERS

MSC IN COMPUTER SCIENCE

K. A. D. N. K. WIMALAWARNE

UNIVERSITY OF MORATUWA
JANUARY 2008

This dissertation was submitted to the Department of Computer Science and Engineering of the University of Moratuwa in partial fulfillment of the requirements for the Degree of MSc in Computer Science.

To my parents

Declaration

I, K. A. D. N. K. Wimalawarne, hereby declare that the work included in this dissertation, in part or in whole, has not been submitted for any other academic qualification at any institution.

K. A. D. N. K. Wimalawarne

Dr. Chathura De Silva
Supervisor

Abstract

Face recognition remains one of the biggest challenges for the machine learning community. Extensive research has been carried out in this field for over three decades. Despite the many face recognition methods developed, research on novel methods is still needed to meet the needs of modern applications. In the recent past, kernel methods have been successfully applied to face recognition. We present a novel approach to face recognition using the informative vector machine, a sparse Gaussian process kernel classifier. Experiments with the ORL face database show that the recognition accuracies of the informative vector machine and support vector machines are comparable. However, the informative vector machine can provide sparser solutions than support vector machines. We also found that using automatic relevance determination kernels with the informative vector machine provides a novel approach to dimension reduction in feature space. Overall, the sparse solutions and dimension reduction obtained with the informative vector machine reduce storage space and computational cost while achieving recognition accuracy close to that of support vector machines.
Keywords: Face recognition, Gaussian process, kernel classifier, informative vector machines, sparse, support vector machines

Acknowledgement

First of all, I would like to thank my research supervisor, Dr. Chathura De Silva, for taking me as a student and for providing me guidance throughout my research. I would also like to thank the MSc course supervisor, Prof. Gihan Dias, and the head of the department, Mrs Vishaka Nanayakkara, for accepting me for the MSc course in Computer Science and providing financial support during my work. I am grateful to all the lecturers, both permanent and visiting, whose courses I followed during the MSc program. I would like to thank Dr. Chulantha Kulasekara, Dr. Lanka Udawatta and Dr. Ajantha Athukorala for their support as thesis committee members. During my work at the department I received a lot of assistance from both the technical and non-academic staff, and I would like to thank them for their kind support. I also want to thank all the colleagues in my MSc class and office, whose help and friendship were invaluable. Finally, I wish to thank my parents for the support they have given me throughout my life.

Table of Contents

List of Figures
List of Tables

Chapter 1: Introduction
1.1 Applications of Face Recognition
1.2 Methodology
1.3 Objectives
1.4 Structure of the Thesis

Chapter 2: Literature Survey
2.1 Methods of Face Recognition
2.2 Popular Face Recognition Algorithms
2.2.1 Principal Component Analysis (PCA)
2.2.2 Fisherface
2.2.3 Neural Networks
2.2.4 Support Vector Machines (SVM)
2.2.5 Gaussian Processes
2.3 Informative Vector Machines (IVM)
2.4 Summary

Chapter 3: Kernel Methods
3.1 Kernel Transforms
3.2 Support Vector Machines
3.3 Gaussian Process
3.4 Summary

Chapter 4: Proposed Method
4.1 Introduction
4.2 Assumed Density Filtering (ADF) Approximation
4.3 Binary Classification
4.4 Data Point Selection
4.5 Kernel Parameter Updates
4.6 Noise Parameter Updates
4.7 Optimization Strategy
4.7.1 Common Data Points and Expectation Propagation
4.7.2 Reducing Randomness
4.8 Stopping Condition
4.9 Kernel Functions
4.9.1 Linear Kernels
4.9.2 Radial Basis Function (RBF) Kernels
4.9.3 Multi-layer Perceptron (MLP) Kernel
4.9.4 Choosing Between Kernels
4.10 Method of Prediction
4.11 Summary

Chapter 5: Experiments
5.1 Face Databases
5.1.1 ORL Database
5.2 Experimental Design
5.2.1 Experimental Setup
5.2.2 Preprocessing of Images
5.3 Experiment 1
5.4 Experiment 2
5.5 Summary

Chapter 6: Results and Analysis
6.1 Experiment 1
6.2 Experiment 2
6.3 Analysis
6.3.1 Accuracy
6.3.2 Sparsity
6.3.3 Dimension Reduction
6.3.4 Training Times
6.3.5 Classification Times
6.3.6 Storage Capacity
6.4 Distribution of ARD Values
6.5 Summary

Chapter 7: Conclusion and Future Work
7.1 Research Achievements
7.2 Future Research

Appendix A
Appendix B
Appendix C
Appendix D
References

List of Figures

2.1 Schematic diagram of face recognition methodologies
2.2 Basic model of an artificial neural network
3.1 Kernel transformation of data space to a feature space
3.2 Binary classification with support vector machines
3.3 A graphical representation of the Gaussian process model as in [21]. This uses the plate notation to indicate the independent relationship between f and y
4.1 Learning decision boundary in IVM
4.2 Iterative probability approximation procedure of ADF
5.1 A sample of faces of ORL database
6.1 Classification error (Experiment 1)
6.2 Classification times in experiment 1
6.3 Training times of classifiers in experiment 1
6.4 Distribution of ARD values
6.5 Storage requirements of classifiers in experiment 1
6.6 Classification errors in experiment 2
6.7 Classification times of experiment 2
6.8 Training times of classifiers of experiment 2
6.9 Distribution of ARD values of subject no. 1 in ORL face database
6.10 Distribution of ARD values of subject no. 10 in ORL face database
6.11 Distribution of ARD values of subject no. 20 in ORL face database
6.12 Distribution of ARD values of subject no. 22 in ORL face database
6.13 Distribution of ARD values of subject no. 30 in ORL face database
6.14 Distribution of ARD values of subject no. 37 in ORL face database

List of Tables

3.1 Most commonly used kernel functions
6.1 Results of experiment 1
6.2 Storage requirements for classifiers in experiment 1
6.3 Results of experiment 2
6.4 Storage requirements of classifiers in experiment 2