IMPROVED ELECTRONIC MAIL CLASSIFICATION USING HYBRIDIZED ROOT WORD EXTRACTIONS
Abstract
Content-based spam filters use a Bayesian probability approach to prevent spam mail from being delivered to the targeted host. Unfortunately, spammers evade these filters by devising sophisticated means of circumventing their detection patterns: they manipulate and rearrange the suspicious terms in spam mail, because content-based filters work effectively only when the suspicious terms are lexically and grammatically correct. This paper proposes combining word stemming with the Bayesian probability approach to restore a spam-free inbox in the electronic mail infrastructure. The hybridized technique detects modified suspicious terms by examining the root of each misspelled or manipulated word, converting it back to the correct or nearest correct token, and examining it as such. When tested with both direct and manipulated spam mail content, the implementation successfully identified spam mail containing manipulated suspicious terms: 99% of the tested spam mails with known manipulated suspicious terms were identified and classified as spam. Manipulating spam mail content therefore has no effect on a filter that combines word stemming with Bayesian probability. The algorithm is effective and accurate, prevents false classification, and negates spammers' innovations.
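The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the substitution table `LEET`, the toy suffix list `SUFFIXES`, the helper `normalize`, and the class `NaiveBayesFilter` are all hypothetical names chosen for illustration, and the suffix stripping is far cruder than a real stemmer. The sketch assumes that manipulated terms can be mapped back to an approximate root before a Laplace-smoothed naive Bayes score is computed.

```python
import math
import re
from collections import Counter

# Hypothetical character substitutions a spammer might use (assumption, not from the paper).
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"})
# Toy suffix list (assumption); a real system would use a proper stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def normalize(token):
    """Map a possibly manipulated token to an approximate root form."""
    token = token.lower().translate(LEET)          # undo digit/symbol substitutions
    token = re.sub(r"[^a-z]", "", token)           # drop inserted punctuation, e.g. "f.r.e.e"
    token = re.sub(r"(.)\1{2,}", r"\1\1", token)   # collapse runs of 3+ identical letters to 2
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Split on whitespace and keep the non-empty normalized tokens."""
    return [t for t in (normalize(w) for w in text.split()) if t]


class NaiveBayesFilter:
    """Two-class naive Bayes over normalized tokens with add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        # Assumes at least one training message per class has been seen.
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = sum(self.docs.values())
        tokens = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            total = sum(self.counts[label].values())
            score = math.log(self.docs[label] / total_docs)
            for tok in tokens:
                score += math.log((self.counts[label][tok] + 1) / (total + len(vocab)))
            log_score[label] = score
        # Convert the two log scores into a posterior probability of spam.
        top = max(log_score.values())
        weights = {k: math.exp(v - top) for k, v in log_score.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


if __name__ == "__main__":
    spam_filter = NaiveBayesFilter()
    spam_filter.train("win free cash prize now", "spam")
    spam_filter.train("free offer claim cash", "spam")
    spam_filter.train("meeting agenda for monday", "ham")
    spam_filter.train("please review the project report", "ham")
    # A manipulated message is scored after its tokens are mapped back to root forms.
    print(spam_filter.spam_probability("w1n fr33 ca$h"))
```

Because normalization runs before both training and scoring, a token such as "fr33" is counted as "free", so the manipulated message is scored against the same vocabulary as an unmodified one.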
Copyright (c) 2023 FUDMA JOURNAL OF SCIENCES
This work is licensed under a Creative Commons Attribution 4.0 International License.