IMPROVED ELECTRONIC MAIL CLASSIFICATION USING HYBRIDIZED ROOT WORD EXTRACTIONS

  • A. O. Okunade
Keywords: Spam, Ham, Email, Suspicious terms, Stemming, Filter, Spammer

Abstract

Content-based spam filters use a Bayesian probability approach to prevent spam mail from being delivered to the targeted host. Unfortunately, spammers have learned to deceive such filters by circumventing their detection patterns: they manipulate and rearrange the suspicious terms in spam mail, because content-based filters work effectively only when the suspicious terms are lexically and grammatically correct. This paper proposes combining word stemming with the Bayesian probability approach to restore a spam-free inbox in the electronic mail infrastructure. The hybridized technique detects modified suspicious terms by examining the root of each misspelled or manipulated word, reconverting it to the correct or near-correct token, and evaluating it as such. When tested with both direct and manipulated spam mail content, the implementation successfully identified spam mail containing manipulated suspicious terms: 99% of the tested spam mails with known manipulated suspicious terms were identified and classified as spam. Manipulated spam mail therefore has little to no effect on a filter that combines word stemming with Bayesian probability. The algorithm is effective and accurate, prevents false classification, and negates spammers' innovations.
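The general idea of pairing token normalization with a Bayesian classifier can be illustrated with a minimal sketch. It is not the paper's implementation: the character-substitution table `LEET`, the suffix list `SUFFIXES`, the helper names `normalize` and `tokenize`, the class `NaiveBayesFilter`, and the training sentences are all assumptions made for illustration. A crude suffix stripper stands in for a real stemmer, and a multinomial naive Bayes model with add-one smoothing stands in for the Bayesian probability step.

```python
import math
import re
from collections import Counter

# Hypothetical look-alike substitutions; not taken from the paper.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "@": "a", "$": "s"})
# Crude suffix list standing in for a real stemmer (an assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def normalize(token):
    """Map a possibly manipulated token to an approximate root form."""
    token = token.lower().translate(LEET)   # undo look-alike characters
    token = re.sub(r"[^a-z]", "", token)    # drop inserted punctuation
    for suffix in SUFFIXES:
        # Strip one suffix, but keep at least three letters of the root.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Split on whitespace and keep the non-empty normalized tokens."""
    return [t for t in (normalize(w) for w in text.split()) if t]


class NaiveBayesFilter:
    """Multinomial naive Bayes over normalized tokens, add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def classify(self, text):
        # Assumes at least one training message per class.
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = sum(self.docs.values())
        tokens = tokenize(text)
        scores = {}
        for label in ("spam", "ham"):
            total = sum(self.counts[label].values())
            score = math.log(self.docs[label] / total_docs)  # class prior
            for tok in tokens:
                score += math.log(
                    (self.counts[label][tok] + 1) / (total + len(vocab))
                )
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data, invented for this example.
spam_filter = NaiveBayesFilter()
spam_filter.train("free viagra offer click now", "spam")
spam_filter.train("win free money now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("please review the attached report", "ham")

print(spam_filter.classify("FR33 v.i.a.g.r.a 0ffer"))  # prints "spam"
```

Because every token is normalized before counting, the manipulated message contributes the same evidence as its plainly spelled form, which is the effect the abstract attributes to examining root words. A production filter would need a proper stemmer and a far larger training corpus than this toy data.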

Published
2023-03-30
How to Cite
Okunade, A. O. (2023). IMPROVED ELECTRONIC MAIL CLASSIFICATION USING HYBRIDIZED ROOT WORD EXTRACTIONS. FUDMA JOURNAL OF SCIENCES, 3(1), 56 - 64. Retrieved from https://fjs.fudutsinma.edu.ng/index.php/fjs/article/view/1427