REVIEW OF CLASSIFICATION METHODS FOR MACHINE LEARNING MODELING

Authors

  • Agu C. Sunday, Department of Computer Science, Benson Idahosa University, Benin, Edo State, Nigeria.
  • Augustina M. Ukeje-Okorie, Department of Computer Science, Federal College of Agriculture, Ishiagu, Ebonyi State, Nigeria.

DOI:

https://doi.org/10.33003/fjs-2026-1009-5315

Keywords:

Decision tree, Ensembles, Logistic regression, Machine Learning, Predictive Analysis, Support Vector Machine

Abstract

This study surveyed Logistic Regression (LR), Support Vector Machine (SVM), and Decision Tree (DT) methods and their corresponding ensembles, together with their strengths and weaknesses, considering machine learning accuracy metrics, probability estimation, and other features, and determined how frequently each is applied in different sectors. Descriptive statistics were used to analyze the survey, which revealed that SVM yielded better predictive performance in different application areas. The study pointed out that no algorithm outright outperforms the others, as performance depends on the dataset, and recommended further work on ensembles of DT and LR to leverage their fundamental advantages. Following the report on the strengths and weaknesses, a simple workflow was suggested. Finally, the application areas of the algorithms were reviewed, and the review pointed out that social media is not primarily an application area but a web application for generating datasets for predictive analytics.
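
The abstract refers to machine learning accuracy metrics without listing them on this page. As an illustrative sketch only, assuming the usual confusion-matrix metrics (accuracy, precision, recall, F1) are among those considered, the following shows how they can be computed for any binary classifier such as LR, SVM, or DT. The function names and the label vectors are hypothetical and are not taken from the study.

```python
# Illustrative sketch: confusion-matrix metrics for a binary classifier.
# The function names and the sample labels are hypothetical, not from the study.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for the given positive class label."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground-truth labels and predictions from some classifier.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Running the same function on the predictions of each candidate model over the same held-out data gives a like-for-like comparison, which is one way to see that relative performance depends on the dataset.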

Frequency Distribution for Usage of DT, LR, and SVM

Published

16-06-2026

How to Cite

Sunday, A. C., & Ukeje-Okorie, A. M. (2026). REVIEW OF CLASSIFICATION METHODS FOR MACHINE LEARNING MODELING. FUDMA JOURNAL OF SCIENCES, 10(9), 103-109. https://doi.org/10.33003/fjs-2026-1009-5315