CARDIOVASCULAR DISEASE PREDICTION USING RANDOM FOREST MACHINE LEARNING ALGORITHM

  • Aminu Bashir Suleiman, Federal University Dutsin Ma, Katsina
  • Stephen Luka
  • Muhammad Ibrahim
Keywords: Cardiovascular disease, K nearest neighbor, Machine learning, Support Vector, Logistic Regression, Naïve Bayes

Abstract

Every year, cardiovascular disease (CVD) claims the lives of nearly 17 million people worldwide. Predicting heart disease early and accurately can help prevent delays in therapy and improve outcomes. Machine learning techniques for analyzing patient data have shown promise for better predictive capability than conventional methods; however, gaps remain in areas such as algorithm blending, standardization, feature optimization, and model tuning, all of which call for a rigorous methodology. This study aims to build a more refined machine learning model for predicting heart disease, with detailed performance reporting and a robust methodology, and to benchmark it against established methods. An improved random forest (RF) model was developed using a clinical dataset obtained from an online repository. It was then tested against logistic regression and support vector machine baselines, as well as Naïve Bayes, K-nearest neighbors, and decision tree classifiers. The approach combined systematic data preprocessing, filtering of redundant features, and RF hyperparameter tuning. Accuracy, precision, recall, F1 score, and ROC analysis were used as evaluation measures. Judged by F1 score, AUC (1.00), and accuracy (90%), the RF model outperformed the remaining models, which exhibited AUCs of 0.9, 0.82, and 0.9. On the public dataset, the refined RF model demonstrated exceptional predictive performance, highlighting the promise of a methodical machine learning approach to improving heart disease prediction. Future research should focus on external clinical validation and on optimization across diverse patient populations.
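
The evaluation measures named in the abstract (accuracy, precision, recall, F1 score, and the area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the ten toy labels and scores are assumptions chosen only to show how each measure is derived from predictions. The AUC is computed with the rank-based (Mann-Whitney) formulation, which is equivalent to the area under the empirical ROC curve.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = disease, 0 = no disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted disease probabilities from some model.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas the AUC is computed from the raw scores and summarizes ranking quality across all thresholds. This is why a model can report a high AUC alongside a lower accuracy.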

Published
2023-12-31
How to Cite
Bashir Suleiman, A., Luka, S., & Ibrahim, M. (2023). CARDIOVASCULAR DISEASE PREDICTION USING RANDOM FOREST MACHINE LEARNING ALGORITHM. FUDMA JOURNAL OF SCIENCES, 7(6), 282-289. https://doi.org/10.33003/fjs-2023-0706-2128