AN ENHANCED FEATURE ENGINEERING TECHNIQUE FOR CREDIT CARD FRAUD DETECTION
Abstract
As the world moves toward a cashless society and online transactions become more common, the number of credit card users has increased substantially. This growth has been accompanied by a rise in credit card fraud, one of the major cybercrimes faced by users, with consequential damage to financial institutions. Credit card fraud detection is therefore crucial. Machine learning based credit card fraud detection systems exist, but machine learning approaches struggle with imbalanced data and depend on selecting the best features for effective classification. Class imbalance occurs when a dataset contains only a small number of observations of the minority class compared with the majority class. This study addresses the challenges of feature selection and data imbalance in credit card fraud detection through an enhanced feature engineering method. We propose a technique that uses a wrapper method to select the best features and mitigates data imbalance with a hybrid approach that combines SMOTE, random oversampling and under-sampling. Five popular machine learning classifiers (Random Forest, Naïve Bayes, K Nearest Neighbor, Decision Tree and Support Vector Machine) are used with balanced and imbalanced datasets to evaluate the technique. The results show significant improvements in accuracy, precision, recall, F1-score, and Kappa score with the enhanced method. Specifically, K Nearest Neighbor, Random Forest and Support Vector Machine achieve perfect accuracy with the balanced data.
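The hybrid balancing idea can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the function names `smote_like` and `hybrid_resample`, the `target` parameter, the even split between interpolated and duplicated minority rows, and the toy two-feature data are all assumptions made for illustration, since the abstract does not state how the three techniques are combined.

```python
import random

def smote_like(minority, n_new, k=3, rng=None):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

def hybrid_resample(X, y, target, seed=0):
    """Hypothetical hybrid: under-sample class 0 to `target` rows and grow
    class 1 to `target` rows, half by interpolation, half by duplication."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]
    # Random under-sampling of the majority class
    majority = rng.sample(majority, min(target, len(majority)))
    deficit = max(0, target - len(minority))
    n_smote = deficit // 2  # assumed split, not taken from the paper
    synthetic = smote_like(minority, n_smote, rng=rng)
    # Random oversampling: plain duplication of existing minority rows
    duplicates = [rng.choice(minority) for _ in range(deficit - n_smote)]
    minority = minority + synthetic + duplicates
    return majority + minority, [0] * len(majority) + [1] * len(minority)

# Toy data: 95 legitimate rows (label 0) and 5 fraudulent rows (label 1)
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(95)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(5)]
y = [0] * 95 + [1] * 5

X_bal, y_bal = hybrid_resample(X, y, target=30)
print(y_bal.count(0), y_bal.count(1))  # prints "30 30"
```

In practice, resampling of this kind is normally applied to the training split only, so that the test data keeps its original class distribution.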
Copyright (c) 2024 FUDMA JOURNAL OF SCIENCES
This work is licensed under a Creative Commons Attribution 4.0 International License.