• Folasade Mercy Okikiola, Department of Computer Science, School of Computing, Federal University of Technology, Akure, Ondo State, Nigeria
• Olumide Sunday Adewale, Department of Computer Science, School of Computing, Federal University of Technology, Akure, Ondo State, Nigeria
• Olumide Olayinka Obe, Department of Computer Science, School of Computing, Federal University of Technology, Akure, Ondo State, Nigeria
Keywords: Classification, Decision Tree, Diabetes, Naïve Bayes, Ontology


Diabetes is a serious health condition in which people suffer from uncontrollably high blood sugar. Existing detection approaches are limited by data imbalance, weak feature selection, and the lack of a generic framework for diabetes classification. In this research, an ontology-based diabetes classification model using a naïve Bayes classifier was developed. The model is divided into five modules: data collection, feature selection, ontology construction, classification, and document query. The data collection module adapted the PIMA Indian Diabetes Database to predict diabetes. The feature selection module employed a multi-step approach to select the most important features from the dataset. The ontology construction module used a decision tree classifier to automatically construct ontology rules from the chosen features. The classification module used a naïve Bayes classifier to automatically classify the constructed ontology, based on the user's query, as indicating diabetes or not. The document query module then searches for and returns the documents requested by users, based on the ontology-based naïve Bayes classification. Under 10-fold cross-validation, the proposed model achieved precision, accuracy, recall, and F1-score of 96.5%, 93.55%, 79.2%, and 87.0%, respectively. It was benchmarked against K-Nearest Neighbor (KNN), Decision Tree (DT), Multilayer Perceptron (MLP), Logistic Regression (LR), Hidden Markov Model (HMM), Support Vector Machine (SVM), Naïve Bayes (NB), Random Forest (RF), and Deep Convolutional Neural Network (DCNN). With an area under the curve of 0.9578, the developed model proved more accurate than the other methods compared. The results indicate that the model is a cost-effective means of predicting diabetes.
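The classification and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian naïve Bayes variant (the abstract does not say which variant was used), it omits the ontology, feature selection, and 10-fold cross-validation stages, and the two-feature toy records (glucose, BMI) are hypothetical values invented for illustration. The function names `fit_gaussian_nb`, `predict`, `scores`, and `f1_from` are likewise illustrative.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one record."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def scores(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and accuracy from true/predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from(precision, recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, f1, accuracy


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy records: [glucose, BMI]; label 1 = diabetic, 0 = not.
X = [[85, 22.0], [90, 24.5], [95, 23.1], [150, 33.0], [160, 35.2], [170, 31.8]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)

predictions = [predict(model, row) for row in X]
print(scores(y, predictions))

# Consistency check on the figures reported in the abstract.
print(round(f1_from(0.965, 0.792), 3))
```

The last line shows that the reported precision (96.5%) and recall (79.2%) give an F1-score of about 0.870, which agrees with the 87.0% stated in the abstract.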



How to Cite
Okikiola, F. M., Adewale, O. S., & Obe, O. O. (2023). A DIABETES PREDICTION CLASSIFIER MODEL USING NAIVE BAYES ALGORITHM. FUDMA JOURNAL OF SCIENCES, 7(1), 253-260. https://doi.org/10.33003/fjs-2023-0701-1301