AN IMPROVED HEART DISEASE PREDICTION USING INFORMATION GAIN-BASED FEATURE SELECTION

Authors

  • Umar Murtala Mani, Alqalam University Katsina
  • Zahraddeen Sufyanu, Federal University Dutse
  • Usman Mahmud, Northwest University Kano
  • Usman Umar, Federal University Kashere
  • Surayya Tajoudden Bashir, Murtala Muhammad Specialist Hospital, Kano

DOI:

https://doi.org/10.33003/fjs-2025-0912-4267

Keywords:

Heart Disease Prediction, Cardiovascular Disease, Information Gain, Feature Selection, Machine Learning, Clinical Data Analysis, Kaggle Dataset

Abstract

Heart disease remains one of the leading causes of mortality worldwide. Early and accurate prediction of heart disease risk is therefore essential for guiding timely clinical intervention and reducing the burden on healthcare systems. However, predictive models often lose performance because medical datasets contain redundant and irrelevant features. This study addresses this challenge by applying Information Gain-based feature selection to improve the reliability of heart disease prediction. The study used the Kaggle Heart Disease Dataset, which comprises demographic and clinical attributes including age, sex, chest pain type, resting blood pressure, cholesterol level, exercise-induced angina, and ST-slope characteristics. Information Gain, an entropy-based ranking criterion, was used to identify and retain the most informative features while discarding less relevant ones. By reducing dimensionality, the approach improved both model interpretability and computational efficiency. In experimental evaluation, models trained on the Information Gain-selected features achieved higher accuracy and better generalization than models trained on the full feature set. The feature selection process also highlighted the clinical risk factors most strongly associated with heart disease, such as chest pain type, ST-slope, and exercise-induced angina. The results indicate that Information Gain-based feature selection improves predictive performance and provides insight into the attributes most indicative of heart disease risk. This approach contributes to the development of lightweight, interpretable, and effective predictive systems that can support clinical decision-making and early diagnosis.
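
As an illustration of the entropy-based ranking described in the abstract, the following is a minimal sketch of Information Gain for discrete features, computed as IG(Y; X) = H(Y) - sum over v of P(X = v) * H(Y | X = v). It is not the authors' implementation: the function names, the toy records, and the choice to keep the top two features are assumptions made for illustration only, and the values are not taken from the Kaggle dataset.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - sum_v P(X=v) * H(Y | X=v) for a discrete feature X."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    scores = {name: information_gain(values, labels)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records for illustration only (NOT from the Kaggle dataset).
toy_columns = {
    "chest_pain_type": ["ASY", "ASY", "ATA", "NAP", "ASY", "ATA", "NAP", "ASY"],
    "exercise_angina": ["Y", "Y", "N", "N", "Y", "N", "N", "N"],
    "sex": ["M", "F", "M", "F", "M", "F", "M", "F"],
}
toy_labels = [1, 1, 0, 0, 1, 0, 0, 1]  # 1 = heart disease, 0 = no heart disease

ranking = rank_features(toy_columns, toy_labels)

# Keep the two highest-ranked features; the cutoff is an arbitrary assumption.
selected = [name for name, _ in ranking[:2]]

for name, score in ranking:
    print(f"{name}: {score:.3f}")
print("selected:", selected)
```

Continuous attributes such as age or cholesterol level would need to be discretized (or scored with a continuous mutual-information estimator) before a ranking of this kind applies; the sketch covers only the categorical case.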

Published

31-12-2025

How to Cite

Mani, U. M., Sufyanu, Z., Mahmud, U., Umar, U., & Bashir, S. T. (2025). AN IMPROVED HEART DISEASE PREDICTION USING INFORMATION GAIN-BASED FEATURE SELECTION. FUDMA JOURNAL OF SCIENCES, 9(12), 599-603. https://doi.org/10.33003/fjs-2025-0912-4267
