BAYESIAN-OPTIMIZED ENSEMBLE SUPPORT VECTOR MACHINE MODEL FOR PHISHING EMAIL DETECTION

Authors

  • Igba Aji L.
  • Idris Ismaila
  • Subairu Sikiru. O.
  • Noel Moses. D.
  • Ahmed Sulemam

DOI:

https://doi.org/10.33003/fjs-2025-0912-4356

Keywords:

Phishing Detection, Ensemble Learning, Bayesian Optimization

Abstract

With the rapid growth of email use, phishing and malware attacks have become more frequent and sophisticated, often slipping past traditional defenses such as blacklists and rule-based filters. Existing detection models, including SVM, XGBoost, and CNN, have improved accuracy but still depend heavily on manually crafted features and struggle to adapt to new or evolving attack patterns, which creates the need for a more flexible detection approach that can learn and adapt to emerging email threats. This study develops an ensemble phishing email detection model that combines SVM and XGBoost through stacking, tunes it with Bayesian optimization, and evaluates it using accuracy, precision, recall, F1-score, and ROC-AUC. Four SVM variants were developed and tested: Baseline, Grid Search, SGD, and Bayesian-optimized. A well-known Kaggle phishing dataset was used to allow fair comparison. Preprocessing reduced the dataset from 10,000 emails with 1,250 features to 9,872 emails with 500 features. On this data, the Baseline SVM reached 0.9287 accuracy, the Grid Search SVM improved to 0.96, and the SGD SVM dropped slightly to 0.92. The Bayesian SVM performed best among the SVM variants at 0.9667, showing greater stability and generalization. The Bayesian-optimized Hybrid Ensemble SVM–XGBoost achieved 0.992 accuracy and 0.9992 ROC-AUC, confirming its reliability and effectiveness in phishing detection. Stacking substantially enhanced model stability, generalization, and real-time reliability.
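The five evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code: the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only, with label 1 standing for a phishing email.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_score: List[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall, F1 and ROC-AUC for a binary task.

    y_true holds 0/1 labels (1 = phishing, by assumption) and y_score holds
    the model's score for the positive class.
    """
    # Turn scores into hard predictions at the (assumed) threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # ROC-AUC as the probability that a randomly chosen positive example
    # receives a higher score than a randomly chosen negative one
    # (ties count as half). This pairwise form is O(n_pos * n_neg).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        roc_auc = wins / (len(pos) * len(neg))
    else:
        roc_auc = float("nan")  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "roc_auc": roc_auc}


# Hypothetical toy data, not taken from the study.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = binary_metrics(toy_true, toy_score)
print(metrics)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC is computed from the ranking of scores alone, which is one reason the two kinds of figures are usually reported together.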

 

 


Research Process Framework

Published

31-12-2025

How to Cite

Aji L., I., Ismaila, I., Sikiru. O., S., Moses. D., N., & Sulemam, A. (2025). BAYESIAN-OPTIMIZED ENSEMBLE SUPPORT VECTOR MACHINE MODEL FOR PHISHING EMAIL DETECTION. FUDMA JOURNAL OF SCIENCES, 9(12), 837-842. https://doi.org/10.33003/fjs-2025-0912-4356
