Abstract:
My project is an implementation of the Naive Bayes algorithm to classify E-mails. The main idea of the project is to implement the algorithm effectively in order to classify bank E-mails in the bank's internal authenticated mail system. E-mail has the potential to improve efficiency and reduce the costs involved in communication. Even after the advent of newer technologies such as instant messaging and VoIP, E-mail remains the number one application for business communication.
With the increasing volume of information exchanged through internet-based communication, our bank's internal authenticated mail system needs an efficient tool to classify E-mails into categories. In this way, we can easily organize the large number of E-mails available. Automated text categorization is the process of assigning pre-defined category labels to an E-mail based on its contents.
Text categorization has many applications. For example, web pages can be classified into categories to speed up internet search, which is very useful for search engines such as Yahoo and Google. E-mail service providers also use these classification techniques for spam filtering and E-mail classification.
I have trained the developed algorithm on more than a thousand pre-categorized E-mails. I have tested the text categorization algorithm, which is based on the Naive Bayes classifier, with data sets of several different sizes, and I have evaluated its accuracy. The experimental results show that my approach is efficient.
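The train, classify and evaluate cycle described above can be illustrated with a minimal sketch. It is not the project's actual implementation: it assumes a multinomial Naive Bayes model with add-one (Laplace) smoothing and a simple word tokenizer, and the category names ("loans", "it_support"), the sample messages, and every function and class name are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # E-mails seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # all words seen in training

    def train(self, labeled_emails):
        """Update the counts from (text, category) pairs."""
        for text, category in labeled_emails:
            self.doc_counts[category] += 1
            for word in tokenize(text):
                self.word_counts[category][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        """Return the category with the highest log posterior score."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = [w for w in tokenize(text) if w in self.vocabulary]
        best_category, best_score = None, float("-inf")
        for category, n_docs in self.doc_counts.items():
            # Log prior: share of training E-mails in this category.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                count = self.word_counts[category][word]
                # Smoothed log likelihood of the word given the category.
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


def accuracy(classifier, labeled_emails):
    """Fraction of (text, category) pairs that are classified correctly."""
    correct = sum(
        1 for text, category in labeled_emails
        if classifier.classify(text) == category
    )
    return correct / len(labeled_emails)


# Hypothetical pre-categorized E-mails, used only for illustration.
TRAINING_EMAILS = [
    ("Please approve the loan application for the customer", "loans"),
    ("Loan interest rate revision for home loan customers", "loans"),
    ("Server maintenance scheduled for the core banking system tonight", "it_support"),
    ("Password reset request for the mail system account", "it_support"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING_EMAILS)
print(classifier.classify("home loan approval pending"))
print(accuracy(classifier, TRAINING_EMAILS))
```

Scores are summed in log space so that multiplying many small probabilities does not underflow, and words never seen in training are skipped because they carry no information about any category. In practice accuracy would be measured on E-mails held out from training rather than on the training set itself.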