LEVERAGING MACHINE LEARNING MODELS FOR PREDICTING THE LIKELIHOOD OF POLYCYSTIC OVARIAN SYNDROME IN WOMEN OF REPRODUCTIVE AGE

Authors

  • Joseph O. Okhuoya
  • Festus O. Oliha
  • K. M. Martins

DOI:

https://doi.org/10.33003/fjs-2025-0901-3088

Keywords:

Diagnosis, Machine Learning, Machine Learning Models, PCOS

Abstract

Conventional diagnostic approaches for polycystic ovarian syndrome (PCOS), a heterogeneous condition with no single diagnostic test, are often invasive, time-consuming, and reliant on varying criteria, which leads to inconsistent diagnoses. This study explores machine learning applications to close gaps in the prediction and diagnosis of PCOS, offering a potential pathway toward greater accuracy and efficiency. The Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology was adopted for implementation, using a comprehensive dataset from Kaggle, a public data repository. The results identified the XGBoost algorithm as the most effective predictive model for PCOS, achieving an accuracy of 98.7%. They indicate that XGBoost is accurate and dependable in diagnosing PCOS, positioning the PCOS Predictor as a valuable tool in clinical environments. The study thus represents a significant step toward transforming the diagnostic landscape of PCOS, combining technological advances with clinical insight to enhance women's healthcare.

Published

2025-01-31

How to Cite

Okhuoya, J. O., Oliha, F. O., & Martins, K. M. (2025). LEVERAGING MACHINE LEARNING MODELS FOR PREDICTING THE LIKELIHOOD OF POLYCYSTIC OVARIAN SYNDROME IN WOMEN OF REPRODUCTIVE AGE. FUDMA JOURNAL OF SCIENCES, 9(1), 323-332. https://doi.org/10.33003/fjs-2025-0901-3088