Abstract:
Data mining is a subfield of database management that is mainly applied to large and complex databases to eliminate randomness and discover hidden patterns. Fraud detection in data mining is the process of identifying fraudulent acts by analyzing a dataset. This research focuses on identifying fraudulent acts in a water bottle delivery process, and on managing the invoicing process together with the water delivery process. Due to inefficiencies in the water delivery process, the cost of lost bottles over the last six months is approximately Rs 213,070.00. By detecting fraudulent acts, institutes can save resources and cost [3]. For this study, a sample dataset has been used to identify how the fraudulent activities occur. The sample dataset was selected from cases where the data entry person had found physical evidence that bottles had been sold to outsiders.
The data mining techniques used to detect fraud are Naïve Bayes, decision trees, and neural networks. With these techniques, predictive models can be developed to estimate quantities such as the probability of fraudulent behavior. ROC curves have been used for model assessment to provide a more intuitive analysis of the models, and a confusion matrix has been used to describe the performance of a classification model on test data for which the true values are known.
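
As a minimal illustration of the model assessment step described above, the following sketch computes a confusion matrix and the points of an ROC curve for a small set of labels and predicted fraud probabilities. The data, the 0.5 decision threshold, and all function names are hypothetical and chosen only for illustration; they are not taken from the study's dataset or models.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN), where label 1 means 'fraudulent'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative example")
    points = [(0.0, 0.0)]
    # Sweep the threshold from the highest score down to the lowest.
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        tp, fp, _, _ = confusion_matrix(y_true, y_pred)
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical test data: 1 = fraudulent delivery record, 0 = legitimate.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical fraud probabilities produced by some classifier.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]

# Assumed decision threshold of 0.5 for the confusion matrix.
y_pred = [1 if s >= 0.5 else 0 for s in scores]
tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
roc = roc_points(y_true, scores)
area = auc(roc)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"AUC={area:.3f}")
```

The confusion matrix summarizes performance at a single threshold, while the ROC curve shows the trade-off between true and false positive rates across all thresholds, which is why the two are typically reported together.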