PREDICTION OF DISSOLVED OXYGEN IN HARBOURS USING ARTIFICIAL NEURAL NETWORKS: AN APPLICATION TO THE PORT OF COLOMBO

By W.K.C.N. DAYANTHI

THIS THESIS WAS SUBMITTED TO THE DEPARTMENT OF CIVIL ENGINEERING OF THE UNIVERSITY OF MORATUWA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ENGINEERING.

DEPARTMENT OF CIVIL ENGINEERING
UNIVERSITY OF MORATUWA
SRI LANKA
DECEMBER 2001

DECLARATION

I hereby declare that the work included in the thesis, in part or whole, has not been submitted in any form for any other academic qualification of any institution.

Miss W.K.C.N. Dayanthi
M.Eng/C/01/2000

Certified by Dr. Mahesh Jayaweera, Supervisor.

ABSTRACT

Maintaining a high dissolved oxygen (DO) level in harbours is highly important, because DO depletion could have catastrophic effects on day-to-day port functions such as dredging and other maintenance work. Depletion of DO results not only in toxic gases such as methane and hydrogen sulfide but also in the accumulation of wastes. Frequent monitoring of DO is therefore imperative, but it poses practical difficulties because of ship movements and other activities. Hence, DO was successfully predicted with an empirical model based on Artificial Neural Networks (ANNs), applied to the Port of Colombo (PoC). The model aims to reduce the frequency of DO monitoring and to foresee the responses of the system to environmental changes. The performance of the ANNs was compared with that of Multiple Linear Regression (MLR). Monthly values of 14 water quality parameters at several depths were collected for the four years from 1997 to 2000. Rainfall and wind velocity values for the corresponding period were also collected.
The neural network with 7 inputs and 45 hidden neurons performed well, giving correlation coefficients (R) of 0.87 and 0.67 for calibration and verification, respectively. The inputs are temperature, depth and five rainfall intensities (including values on the four immediately preceding days). A sensitivity analysis was carried out to assess the effect of small changes in each input on the neural network output. The MLR model with 7 input variables gave an R of 0.45 for calibration after several transformations. Temperature was the most influential of the ANN inputs on the output. In conclusion, it could be inferred that the ANN model is capable of predicting DO in the PoC considerably well compared with MLR.

ACKNOWLEDGEMENT

First, I would like to render my cordial thanks to my supervisor, Dr. Mahesh Jayaweera, for advising and guiding me in all aspects. My gratitude should also go to Professor (Mrs.) N. Rathnayake, Director of Postgraduate Studies, for supporting me in numerous ways. Further, I would like to extend my gratitude to the Asian Development Bank (ADB) for the provision of financial aid. Special thanks go to the Japan Port Consultant (Pvt.) Ltd. for giving me the necessary water quality data of the inner harbour of the Port of Colombo. I am also grateful to the Meteorological Department for providing me with meteorological data. It is with grateful appreciation that I acknowledge the contribution of the staff of the Environmental Engineering Laboratory in the University of Moratuwa, namely Ms. Priyanka Dissanayake, Mrs. Nilanthi Gunathilake and Mr. Justin Silva. Last but not least, my sincere thanks go to all the others who helped me in numerous ways to make this a success.

TABLE OF CONTENTS
Page No.
List of tables    i
List of figures    iii
List of symbols    vi
List of abbreviations    vii

Chapter 1: Introduction
1.1 Background    1-1
1.2 Application to the Port of Colombo    1-4
1.2.1 General description of the Port of Colombo    1-4
1.2.2 Environmental degradation in the Port of Colombo    1-5
1.2.3 Water quality parameters    1-6
1.3 Aims and Objectives    1-10
1.4 Methodology    1-10
1.5 Main findings    1-11
1.6 Presentation of chapters    1-12

Chapter 2: Literature Review
2.1 Research carried out so far on Artificial Neural Network (ANN) in the field of ecology    2-1

Chapter 3: Model Formulation
3.1 Multiple Linear Regression    3-1
3.1.1 Introduction    3-1
3.1.2 Multiple correlation    3-2
3.1.3 Transformation of data    3-2
3.1.4 Linear regression    3-5
3.1.5 Multiple linear regression using more than two independent variables    3-8
3.1.6 Stepwise regression    3-9
3.2 Artificial Neural Networks
3.2.1 Introduction    3-9
3.2.2 Backpropagation techniques    3-9
3.3 Windows Neural Network (WinNN); Software Package    3-20

Chapter 4: Materials and Methods
4.1 Data collection    4-1
4.2 Modeling Techniques    4-1
4.2.1 Introduction    4-1
4.2.2 Case Study    4-4
4.2.3 Modeling using 9 input parameters    4-6
4.2.4 Modeling using 7 input parameters    4-7

Chapter 5: Results
5.1 Summary    5-1
5.2 Results of the case study    5-1
5.2.1 ANN-16    5-1
5.2.2 Sensitivity analysis on ANN-16    5-1
5.2.3 MLR-16    5-8
5.2.4 Sensitivity analysis on MLR-16    5-8
5.3 Modeling using 9 input parameters    5-8
5.3.1 ANN-9    5-8
5.3.2 MLR-9    5-16
5.4 Modeling using 7 input parameters    5-16
5.4.1 ANN-7    5-16
5.4.2 Sensitivity analysis on ANN-7    5-16
5.4.3 MLR-7    5-18

Chapter 6: Discussion
6.1 Correlation between observed and estimated values    6-1
6.1.1 ANN-16 and ANN-9    6-1
6.1.2 MLR-16 and MLR-9    6-2
6.1.3 ANN-7 and MLR-7    6-3
6.2 Examination of the residuals    6-5
6.2.1 Relationship between the residuals and DO (Observed and Predicted)    6-5
6.2.2 The normality of residuals    6-7
6.3 Sensitivity analysis
6.3.1 Sensitivity analyses on ANN-16, ANN-7 and MLR-16    6-8

Chapter 7: Conclusions and Further recommendations
7.1 Conclusion and further recommendations    7-1

Bibliography
Appendix A    A-1
Appendix B    B-1
Appendix C    C-1

LIST OF TABLES

Table 3.1 Transforms for several types of curves
Table 5.1 Parameters of the WinNN control panel for each model
Table 5.2 Statistical Evaluation
Table 6.1 Percentage variation of residuals for testing data of each model
Table A-1 Input data of calibration data set of ANN-16 and MLR-16
Table A-2 Input data of verification data set of ANN-16 and MLR-16
Table A-3 Input data of calibration data set of ANN-9 and MLR-9
Table A-4 Input data of verification data set of ANN-9 and MLR-9
Table A-5 Input data of calibration data set of ANN-7 and MLR-7
Table A-6 Input data of verification data set of ANN-7 and MLR-7
Table B-1 Correlation coefficients of the trained neural networks using 9 input parameters
Table B-2 Network characteristics of the trained neural networks using 7 input parameters
Table C-1 Predicted and observed values of the calibration data set of ANN-16, ANN-9 and ANN-7
Table C-2 Predicted and observed values of the verification data set of ANN-16, ANN-9 and ANN-7
Table C-3 Predicted and observed values of the calibration data set of MLR-16, MLR-9 and MLR-7
Table C-4 Predicted and observed values of the verification data set of MLR-16, MLR-9 and MLR-7

LIST OF FIGURES
Page No.

Fig. 3.1 Confirmatory and exploratory fits    3-6
Fig. 3.2 Confirmatory and exploratory fits    3-6
Fig. 3.3 Input Matrix    3-11
Fig. 3.4 Output Matrix    3-11
Fig. 3.5 Schematic illustration of a three-layered feed-forward neural (backpropagation) network    3-13
Fig. 3.6 Network designed for basic backpropagation    3-14
Fig. 3.7 Generalized network design with time lags    3-19
Fig. 4.1 Location of sampling points    4-2
Fig. 5.1 Graph of predicted vs observed DO for ANN-16 (Training)    5-2
Fig. 5.2 Graph of predicted vs observed DO for ANN-16 (Testing)    5-2
Fig. 5.3 Sensitivity of temperature for ANN-16 model    5-3
Fig. 5.4 Sensitivity of Rain-2 for ANN-16 model    5-3
Fig. 5.5 Sensitivity of Ammonium-N for ANN-16 model    5-3
Fig. 5.6 Sensitivity of Total-N for ANN-16 model    5-4
Fig. 5.7 Sensitivity of COD for ANN-16 model    5-4
Fig. 5.8 Sensitivity of SS for ANN-16 model    5-4
Fig. 5.9 Sensitivity of Nitrate-N for ANN-16 model    5-5
Fig. 5.10 Sensitivity of Nitrite-N for ANN-16 model    5-5
Fig. 5.11 Sensitivity of Organic-N for ANN-16 model    5-5
Fig. 5.12 Sensitivity of Total-P for ANN-16 model    5-6
Fig. 5.13 Sensitivity of Wind for ANN-16 model    5-6
Fig. 5.14 Sensitivity of Rain-1 for ANN-16 model    5-6
Fig. 5.15 Sensitivity of Rain-3 for ANN-16 model    5-7
Fig. 5.16 Sensitivity of Rain-4 for ANN-16 model    5-7
Fig. 5.17 Sensitivity of Rain-5 for ANN-16 model    5-7
Fig. 5.18 Graph of predicted vs observed DO for MLR-16 (Training)    5-9
Fig. 5.19 Graph of predicted vs observed DO for MLR-16 (Testing)    5-9
Fig. 5.20 Sensitivity of Temperature for MLR-16 model    5-10
Fig. 5.21 Sensitivity of Ammonium-N for MLR-16 model    5-10
Fig. 5.22 Sensitivity of Rain-1 for MLR-16 model    5-10
Fig. 5.23 Sensitivity of Rain-2 for MLR-16 model    5-11
Fig. 5.24 Sensitivity of Rain-5 for MLR-16 model    5-11
Fig. 5.25 Sensitivity of Total-P for MLR-16 model    5-11
Fig. 5.26 Sensitivity of COD for MLR-16 model    5-12
Fig. 5.27 Sensitivity of SS for MLR-16 model    5-12
Fig. 5.28 Sensitivity of Organic-N for MLR-16 model    5-12
Fig. 5.29 Sensitivity of Nitrate-N for MLR-16 model    5-13
Fig. 5.30 Sensitivity of Nitrite-N for MLR-16 model    5-13
Fig. 5.31 Sensitivity of Total-N for MLR-16 model    5-13
Fig. 5.32 Sensitivity of Wind for MLR-16 model    5-14
Fig. 5.33 Variation of correlation coefficients for training and testing data    5-14
Fig. 5.34 Graph of predicted vs observed DO for ANN-9 (Training)    5-15
Fig. 5.35 Graph of predicted vs observed DO for ANN-9 (Testing)    5-15
Fig. 5.36 Graph of predicted vs observed DO for MLR-9 (Training)    5-17
Fig. 5.37 Graph of predicted vs observed DO for MLR-9 (Testing)    5-17
Fig. 5.38 Variation of correlation coefficients for training and testing data    5-18
Fig. 5.39 Graph of predicted vs observed DO for ANN-7 (Training)    5-19
Fig. 5.40 Graph of predicted vs observed DO for ANN-7 (Testing)    5-19
Fig. 5.41 Sensitivity of Temperature for ANN-7 model    5-20
Fig. 5.42 Sensitivity of Rain-1 for ANN-7 model    5-21
Fig. 5.43 Sensitivity of Rain-2 for ANN-7 model    5-21
Fig. 5.44 Sensitivity of Rain-3 for ANN-7 model    5-21
Fig. 5.45 Sensitivity of Rain-4 for ANN-7 model    5-22
Fig. 5.46 Sensitivity of Rain-5 for ANN-7 model    5-22
Fig. 5.47 Graph of predicted vs observed DO for MLR-7 (Training)    5-23
Fig. 5.48 Graph of predicted vs observed DO for MLR-7 (Testing)    5-23
Fig. 5.49 Graph of predicted vs observed DO for MLR-7 (Training omitting 50 data sets)    5-24
Fig. 5.50 Graph of predicted vs observed DO for MLR-7 (Training omitting 50 data sets)    5-24
Fig. 6.1 Relationship between the residuals and the estimated and observed values of DO for training data set: (1), (2): ANN-7; (3), (4): ANN-9; (5), (6): ANN-16    6-10
Fig. 6.2 Relationship between the residuals and the estimated and observed values of DO for training data set: (1), (2): MLR-7; (3), (4): MLR-9; (5), (6): MLR-16    6-11
Fig. 6.3 Relationship between the residuals and the estimated and observed values of DO for testing data set: (1), (2): ANN-7; (3), (4): ANN-9; (5), (6): ANN-16    6-12
Fig. 6.4 Relationship between the residuals and the estimated and observed values of DO for testing data set: (1), (2): MLR-7; (3), (4): MLR-9; (5), (6): MLR-16    6-13
Fig. 6.5 Distribution of residuals for training data sets: (1) MLR, 16; (2) ANN, 16; (3) MLR, 9; (4) ANN, 9; (5) MLR, 7 (Transformed); (6) ANN, 7    6-14
Fig. 6.6 Distribution of residuals for testing data sets: (1) MLR, 16; (2) ANN, 16; (3) MLR, 9; (4) ANN, 9; (5) MLR, 7 (Transformed); (6) ANN, 7    6-15

LIST OF SYMBOLS

Name
Momentum
Learning parameter
Change in Y for a unit change in X2 keeping X1 constant
Effect of X'{ on Y' keeping X\ constant
Change in Y for a unit change in X2 keeping X1 constant
Effect of X\ on Y' keeping X\ constant
Number of pairs
Correlation coefficient
Determination coefficient
Correlation of X1 with Y
Correlation of X2 with Y
Correlation of X1 with X2
Partial correlation of X2 and Y with X1 controlled
The slope between low summary point and the middle of the curve
The slope between middle summary point and the high summary point
Standard deviation of X1
Standard deviation of X2
Standard deviation of Y data
Standard deviation of Xj data
Standard deviation of Y' data
Standard deviation of X' data
Dependent variable
Dependent variable
Standardized version of X
X coordinate of the high summary point
X coordinate of the low summary point
X coordinate of the middle summary point
Sample mean
Independent variable
Standardized version of Y
Sample mean
Y coordinate of the high summary point
Y coordinate of the low summary point
Y axis value of middle summary point

LIST OF ABBREVIATIONS

Abbreviation    Name
Ammonium-N    Ammonium Nitrogen
ANN    Artificial Neural Network
ANN-16    ANN model with 16 inputs
ANN-9    ANN model with 9 inputs
ANN-7    ANN model with 7 inputs
BPN    Backpropagation Technique
BQ    Bandaranayake Quay
CH4    Methane
COD    Chemical Oxygen Demand
DO    Dissolved Oxygen
H2S    Hydrogen Sulphide
JCT    Jaya Container Terminal
M    Mean
MLR    Multiple Linear Regression
MLR-16    MLR model with 16 inputs
MLR-9    MLR model with 9 inputs
MLR-7    MLR model with 7 inputs
Nitrate-N    Nitrate Nitrogen
Nitrite-N    Nitrite Nitrogen
Organic-N    Organic Nitrogen
ORP    Oxidation-Reduction Potential
PoC    Port of Colombo
PVQ    Prince Vijaya Quay
QEQ    Queen Elisabeth Quay
Rain-1    Rainfall intensity on the 1st day
Rain-2    Rainfall intensity on the 2nd immediate previous day
Rain-3    Rainfall intensity on the 3rd immediate previous day
Rain-4    Rainfall intensity on the 4th immediate previous day
Rain-5    Rainfall intensity on the 5th immediate previous day
RMS    Root Mean Square
SLPA    Sri Lanka Ports Authority
SD    Standard Deviation
SS    Suspended Solids
Total-N    Total Nitrogen
Total-P    Total Phosphorus
WinNN    Windows Neural Network