PREDICTING TIMELY GRADUATION OF POSTGRADUATE STUDENTS USING RANDOM FORESTS ENSEMBLE METHOD
Abstract
Graduation time of students, both undergraduate and postgraduate, has recently become a prime concern for universities. Over the years, numerous studies have used data mining techniques to forecast the success of undergraduate students. However, very few works have been reported on predicting the graduation time of postgraduate students, particularly using data from Nigerian universities. This research used supervised classification techniques to develop a Postgraduate Student Graduation Time Prediction Model (PS_GTPM). Data were collected from Bayero University Kano, and the Adaptive Synthetic Sampling (ADASYN) technique was applied to address class imbalance in the data. The model was then developed using the Random Forests ensemble technique. The evaluation results show that data balancing with ADASYN enhanced the ability of the data mining classifiers to forecast when students will graduate. The proposed PS_GTPM based on the Random Forests ensemble method also recorded the highest prediction accuracy, with a score of more than 83%, compared to the other methods. Overall, PS_GTPM can be used to forecast whether a thesis-based graduate study will be completed on time.
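The abstract does not describe how ADASYN was implemented, so the following is only a minimal, standard-library sketch of the general idea: minority-class records that are surrounded by more majority-class neighbours receive proportionally more synthetic copies, each created by interpolating towards a nearby minority record. The function name adasyn, its parameters (k, beta, seed), and the toy two-feature data are assumptions made for illustration; they are not the study's data or code. As a simplification, interpolation partners are drawn from the k nearest minority records.

```python
import math
import random


def adasyn(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Return synthetic minority points generated in the style of ADASYN.

    Illustrative sketch only: X is a list of numeric feature lists,
    y the matching list of class labels.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    total = round((n_maj - n_min) * beta)  # number of synthetic points wanted
    if total <= 0 or n_min < 2:
        return []

    def neighbours(i, candidates):
        # k nearest candidates to X[i] by Euclidean distance, excluding i
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # r_i: share of majority-class points among the k nearest neighbours
    ratios = []
    for i in minority_idx:
        nn = neighbours(i, range(len(X)))
        ratios.append(sum(1 for j in nn if y[j] != minority) / len(nn))
    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        return []  # no minority point has majority neighbours

    synthetic = []
    for i, r in zip(minority_idx, ratios):
        g_i = round(r / ratio_sum * total)  # harder points get more copies
        mates = neighbours(i, minority_idx)
        for _ in range(g_i):
            j = rng.choice(mates)
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Hypothetical toy data: label 1 is the minority class, label 0 the majority.
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3],              # minority
     [1.1, 1.1], [3.0, 3.0], [3.2, 2.9], [2.9, 3.1],  # majority
     [3.1, 3.3], [3.3, 3.0], [2.8, 2.8], [3.0, 3.4]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

synthetic = adasyn(X, y)
print(len(synthetic), "synthetic minority points generated")
```

In practice the balanced training set (original records plus the synthetic ones) would then be passed to the classifier, while the test set is left unbalanced so that the reported accuracy reflects the real class distribution.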
Copyright (c) 2023 FUDMA JOURNAL OF SCIENCES
This work is licensed under a Creative Commons Attribution 4.0 International License.