PREDICTING TIMELY GRADUATION OF POSTGRADUATE STUDENTS USING RANDOM FORESTS ENSEMBLE METHOD

  • Hafsat Sabiu Bako
  • Faruku Umar Ambursa, Bayero University Kano
  • Bashir Shehu Galadanci, Bayero University
  • Muhammad Garba, Kebbi State University of Science and Technology
Keywords: Educational Data Mining (EDM), Student Performance Prediction, Machine Learning, Ensemble Learning

Abstract

The graduation time of students, both undergraduate and postgraduate, has recently become a prime focus in universities. Over the years, numerous studies have used data mining techniques to forecast undergraduate students' success. However, very few works have been reported on predicting the graduation time of postgraduate students, particularly using data from Nigerian universities. This research used supervised classification techniques to develop a Postgraduate Student Graduation Time Prediction Model (PS_GTPM). Data were collected from Bayero University Kano, and the adaptive synthetic sampling (ADASYN) technique was applied to address class imbalance in the data. The model was then developed using the Random Forests ensemble technique. The evaluation results show that data balancing with ADASYN enhanced the ability of the data mining classifiers to forecast when students will graduate. The proposed PS_GTPM based on the Random Forests ensemble method also recorded the highest prediction accuracy, at more than 83%, among the methods compared. Overall, PS_GTPM can be used to forecast whether a thesis-based graduate study will be completed on time.
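
To make the balancing step concrete, the following is a minimal sketch of the ADASYN idea in plain Python. It is not the authors' implementation: the function names, the parameters `k`, `beta` and `seed`, and the toy data are all assumptions made for illustration. The sketch generates more synthetic minority samples around minority points whose neighbourhoods are dominated by the majority class. In practice one would more likely rely on a tested library implementation and then train a random forest classifier on the balanced data.

```python
import math
import random


def euclid(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest(point, labelled, k):
    """Return the k (features, label) pairs closest to `point`, excluding itself."""
    others = [item for item in labelled if item[0] is not point]
    return sorted(others, key=lambda item: euclid(point, item[0]))[:k]


def adasyn(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Generate synthetic minority samples (simplified, illustrative ADASYN)."""
    rng = random.Random(seed)
    everyone = list(zip(X, y))
    minority_pts = [x for x, label in everyone if label == minority]
    n_major = len(X) - len(minority_pts)
    # Number of synthetic samples needed; beta=1.0 targets a fully balanced set.
    total_needed = int((n_major - len(minority_pts)) * beta)

    # r_i: share of majority-class points among the k nearest neighbours.
    ratios = [
        sum(1 for _, label in nearest(p, everyone, k) if label != minority) / k
        for p in minority_pts
    ]
    total_ratio = sum(ratios)
    if total_needed <= 0 or total_ratio == 0:
        return []

    minority_pairs = [(p, minority) for p in minority_pts]
    synthetic = []
    for p, r in zip(minority_pts, ratios):
        # Harder-to-learn points (larger r) receive more synthetic samples.
        # Rounding means the total can differ slightly from total_needed.
        n_new = round(r / total_ratio * total_needed)
        neighbours = nearest(p, minority_pairs, k)
        for _ in range(n_new):
            q = rng.choice(neighbours)[0]
            lam = rng.random()
            # Interpolate between p and a randomly chosen minority neighbour.
            synthetic.append([a + lam * (b - a) for a, b in zip(p, q)])
    return synthetic


# Hypothetical toy data: 7 majority (label 0) and 3 minority (label 1) points.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1], [0, 2],
     [1.5, 0.5], [5, 5], [5, 6]]
y = [0] * 7 + [1] * 3

synthetic = adasyn(X, y)
X_balanced = X + synthetic
y_balanced = y + [1] * len(synthetic)
print(y_balanced.count(0), y_balanced.count(1))
```

The balanced `X_balanced` and `y_balanced` would then serve as training data for the classifier. Only the training split should be resampled, so that evaluation reflects the original class distribution.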

Published
2023-07-08
How to Cite
Bako H. S., Ambursa F. U., Galadanci B. S., & Garba M. (2023). PREDICTING TIMELY GRADUATION OF POSTGRADUATE STUDENTS USING RANDOM FORESTS ENSEMBLE METHOD. FUDMA JOURNAL OF SCIENCES, 7(3), 177 - 185. https://doi.org/10.33003/fjs-2023-0703-1773