• Azeez A. Nureni
• O. E. Adekola
Keywords: Loan, Prediction, Machine Learning, Bank, Algorithm, Defaulters, Dataset


Banks offer a range of products, but their main source of income and profit is their credit lines: they earn interest on the loans they extend. A bank's profit or loss therefore depends largely on whether customers repay their loans or default. By forecasting loan defaulters, a bank can reduce its Non-Performing Assets. Previous research has proposed numerous techniques for controlling loan default. Because accurate forecasts are critical for profit maximization, it is important to examine these methodologies and compare them. In this research, two datasets were obtained from Kaggle and used for training and testing. The results from both datasets were compared to determine which algorithm is best suited to predicting loan approval and which features matter most for that prediction. Performance was measured with four metrics: Accuracy, Precision, Recall and F1-Score. Eight algorithms were used to train the models: Logistic Regression, Random Forest, Decision Trees, Linear Regression, Support Vector Machine (SVM), Naïve Bayes, K-means and K-Nearest Neighbors (KNN). The models produced varied outcomes. Across the two datasets, Logistic Regression achieved accuracies of 83.24% and 78.13%, followed by Naïve Bayes with 82.16% and 77.34%, Random Forest
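As an illustration of the four metrics named in the abstract, the following minimal sketch computes Accuracy, Precision, Recall and F1-Score for a binary loan-approval classifier from true and predicted labels. It is not the authors' code: the function name `classification_metrics`, the label encoding (1 = approved, 0 = rejected) and the sample labels are all assumptions made for this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as the positive outcome (assumed here: loan approved).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels (assumption): 1 = loan approved, 0 = loan rejected.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these sample labels there are 4 true positives, 1 false positive, 1 false negative and 2 true negatives, so accuracy is 6/8 and precision and recall are both 4/5. In a comparison such as the one described above, the same function would be applied to each model's predictions on a held-out test set.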

