COMPARATIVE PERFORMANCE OF MACHINE LEARNING MODELS FOR CREDIT CARD FRAUD DETECTION ON IMBALANCED DATA: A STUDY USING SMOTE AND THE KAGGLE EUROPEAN DATASET

Authors

  • Opeyemi Femi Ojo Federal University Lokoja
  • Paul Olawale Otaru Federal University Lokoja, Kogi State
  • Eunice O.Job Federal University Lokoja, Kogi State
  • Benson Onoghojobi Federal University Lokoja

DOI:

https://doi.org/10.33003/fjs-2026-1003-4745

Keywords:

Credit card, fraud, machine learning, statistical analysis, algorithms

Abstract

Credit card fraud remains a major challenge in financial systems, causing substantial monetary losses and undermining trust in electronic transactions. Detecting fraudulent activity is complicated by the highly imbalanced nature of transaction datasets, in which fraudulent transactions form only a small minority of all records. This study evaluates the performance of four machine learning models, namely LinearSVC (a fast linear SVM), Logistic Regression, Decision Tree, and Random Forest, on the Kaggle European credit card dataset. Data preprocessing addressed class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) and standardized feature values to improve training stability. Each model was trained and assessed on accuracy, precision, recall, F1-score, and AUC–ROC, with confusion matrices and ROC curves providing visual evaluation. Experimental results show that LinearSVC achieved the best balance of precision (0.9983) and F1-score (0.9881), while Random Forest achieved near-perfect accuracy (0.9999) but lower precision (0.8800) and recall (0.9002), highlighting the trade-off between overall accuracy and fraud detection sensitivity. The findings emphasize the importance of careful model selection and preprocessing for imbalanced financial datasets. Future work will investigate the integration of deep learning models and real-time fraud detection frameworks, as well as validation on additional datasets from diverse financial environments to improve model robustness and generalizability.
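The two ideas the abstract relies on, SMOTE-style oversampling and the gap between accuracy and precision/recall on imbalanced data, can be illustrated with a minimal standard-library sketch. This is not the study's code: the function names `smote_oversample` and `classification_metrics`, the toy points, and the confusion-matrix counts are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a random
    minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts,
    with fraud treated as the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy minority (fraud) samples with two features each.
toy_fraud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
toy_synthetic = smote_oversample(toy_fraud, n_new=6, k=2, seed=42)

# Hypothetical counts: 10,000 transactions, 20 of them fraudulent.
toy_metrics = classification_metrics(tp=15, fp=5, fn=5, tn=9975)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts, accuracy is 0.999 while precision and recall are both 0.75, which mirrors the kind of trade-off the abstract reports: a model can classify almost every transaction correctly and still miss or mislabel a meaningful share of the fraud cases.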

[Figure: Class distribution before SMOTE]

Published

04-02-2026

How to Cite

Ojo, O. F., Otaru, P. O., O.Job, E., & Onoghojobi, B. (2026). COMPARATIVE PERFORMANCE OF MACHINE LEARNING MODELS FOR CREDIT CARD FRAUD DETECTION ON IMBALANCED DATA: A STUDY USING SMOTE AND THE KAGGLE EUROPEAN DATASET. FUDMA JOURNAL OF SCIENCES, 10(3), 102-108. https://doi.org/10.33003/fjs-2026-1003-4745