COMPARATIVE PERFORMANCE OF MACHINE LEARNING MODELS FOR CREDIT CARD FRAUD DETECTION ON IMBALANCED DATA: A STUDY USING SMOTE AND THE KAGGLE EUROPEAN DATASET
DOI: https://doi.org/10.33003/fjs-2026-1003-4745
Keywords: Credit card, fraud, machine learning, statistical analysis, algorithms
Abstract
Credit card fraud remains a major challenge in financial systems, causing substantial monetary losses and undermining trust in electronic transactions. Detecting fraudulent activity is complicated by the highly imbalanced nature of transaction datasets, in which legitimate transactions vastly outnumber fraudulent ones. This study evaluates four machine learning models, LinearSVC (Fast SVM), Logistic Regression, Decision Tree, and Random Forest, using a Kaggle credit card dataset. Preprocessing consisted of addressing class imbalance with the Synthetic Minority Oversampling Technique (SMOTE) and standardizing feature values to improve training stability. Each model was trained and assessed on accuracy, precision, recall, F1-score, and AUC–ROC, with confusion matrices and ROC curves providing visual evaluation. Experimental results show that LinearSVC achieved the best balance of precision (0.9983) and F1-score (0.9881), while Random Forest achieved near-perfect accuracy (0.9999) but lower precision (0.8800) and recall (0.9002), which highlights the trade-off between overall accuracy and fraud detection sensitivity. The findings emphasize the importance of careful model selection and preprocessing for imbalanced financial datasets. Future work will investigate deep learning models and real-time fraud detection frameworks, and will validate the approach on additional datasets from diverse financial environments to improve robustness and generalizability.
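The trade-off between overall accuracy and fraud detection sensitivity can be illustrated with a small worked example. The sketch below computes the reported metric types (accuracy, precision, recall, F1-score) from confusion matrix counts. The counts are hypothetical and chosen only for illustration; they are not the study's results, and the `ConfusionMatrix` class is an assumed helper, not part of the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary fraud classifier (fraud is the positive class)."""

    tp: int  # fraudulent transactions flagged as fraud
    fp: int  # legitimate transactions flagged as fraud
    fn: int  # fraudulent transactions that were missed
    tn: int  # legitimate transactions passed as legitimate

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        actual_fraud = self.tp + self.fn
        return self.tp / actual_fraud if actual_fraud else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical test set: 100,000 transactions, 170 of them fraudulent.
model = ConfusionMatrix(tp=150, fp=20, fn=20, tn=99_810)
# Baseline that labels every transaction as legitimate.
all_legit = ConfusionMatrix(tp=0, fp=0, fn=170, tn=99_830)

for name, cm in [("model", model), ("always-legitimate", all_legit)]:
    print(
        f"{name}: accuracy={cm.accuracy:.4f} precision={cm.precision:.4f} "
        f"recall={cm.recall:.4f} f1={cm.f1:.4f}"
    )
```

With these assumed counts, the always-legitimate baseline still reaches an accuracy of 0.9983 while detecting no fraud at all, which is why precision, recall, and F1-score are more informative than accuracy on imbalanced data.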
License
Copyright (c) 2026 Opeyemi Femi Ojo, Paul Olawale Otaru, Eunice O. Job, Benson Onoghojobi

This work is licensed under a Creative Commons Attribution 4.0 International License.