EFFECTS OF DATA DELETION AND WEIGHTING ON FISHER’S LINEAR CLASSIFICATION METHOD: A ROBUSTIFICATION APPROACH
Abstract
The performance of classical supervised classification models is hampered by influential observations (IOs). Whether IOs are deleted, weighted, Winsorized, truncated, or retained has a large effect on replicable inference in different classification models. Because of this influence, methods such as IO deletion and IO weighting have been introduced to reduce it. Some of these reduction or deletion methods lose information to varying degrees. In this study, we investigated the effects of IO deletion and weighting, using the Mahalanobis distance as a plug-in to enhance the robustness of the Fisher linear classification method (FLCM). We proposed an F-weight plug-in method to robustify the FLCM. We compared the performance of these methods to determine whether IO deletion or IO weighting retards or enhances the classification accuracy of the FLCM. The study affirmed that IO weighting using the F-weight loses less information than IO deletion using the Mahalanobis distance. This study concludes that the FLCM variant based on the F-weight method showed better classification accuracy and efficiency than the Mahalanobis-distance-based FLCM.
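The contrast between IO deletion and IO weighting can be illustrated with a minimal Python sketch. It is not the method of the study: the abstract does not define the F-weight, so the down-weighting rule below (`cutoff / d2`, capped at 1) is a hypothetical stand-in, and the function names, the simulated data, and the 0.975 chi-square cutoff for two variables are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_sq(X, mean, cov):
    """Squared Mahalanobis distance of each row of X from `mean`."""
    diff = X - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

def io_weights(X, cutoff, mode="delete"):
    """Observation weights from classical Mahalanobis distances.

    mode="delete": weight 0 for flagged observations, 1 otherwise.
    mode="weight": hypothetical smooth down-weighting (NOT the paper's F-weight).
    """
    d2 = mahalanobis_sq(X, X.mean(axis=0), np.cov(X, rowvar=False))
    if mode == "delete":
        return (d2 <= cutoff).astype(float)
    return np.minimum(1.0, cutoff / np.maximum(d2, 1e-12))

def weighted_mean_cov(X, w):
    """Weighted mean and covariance; reduces to np.cov when all weights are 1."""
    w = np.asarray(w, dtype=float)
    mean = (w[:, None] * X).sum(axis=0) / w.sum()
    diff = X - mean
    cov = (w[:, None] * diff).T @ diff / (w.sum() - 1.0)
    return mean, cov

def fisher_rule(X1, X2, w1=None, w2=None):
    """Fisher's linear discriminant coefficients and midpoint cutoff.

    A new x is assigned to group 1 when a @ x >= cut, otherwise to group 2.
    """
    w1 = np.ones(len(X1)) if w1 is None else np.asarray(w1, dtype=float)
    w2 = np.ones(len(X2)) if w2 is None else np.asarray(w2, dtype=float)
    m1, S1 = weighted_mean_cov(X1, w1)
    m2, S2 = weighted_mean_cov(X2, w2)
    n1, n2 = w1.sum(), w2.sum()
    pooled = ((n1 - 1.0) * S1 + (n2 - 1.0) * S2) / (n1 + n2 - 2.0)
    a = np.linalg.solve(pooled, m1 - m2)
    cut = 0.5 * a @ (m1 + m2)
    return a, cut

# Simulated two-variable groups; one planted influential point in group 1.
rng = np.random.default_rng(0)
X1 = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)), [[50.0, 50.0]]])
X2 = rng.normal(3.0, 1.0, size=(50, 2))

# 0.975 quantile of chi-square with 2 degrees of freedom: -2 * ln(0.025).
cutoff = -2.0 * np.log(0.025)

for mode in ("delete", "weight"):
    w1 = io_weights(X1, cutoff, mode)
    w2 = io_weights(X2, cutoff, mode)
    a, cut = fisher_rule(X1, X2, w1, w2)
    print(mode, "coefficients:", a, "cutoff:", cut)
```

Deletion discards a flagged observation entirely, whereas weighting keeps it in the mean and covariance estimates with reduced influence, which is the sense in which weighting can retain more information.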
Copyright (c) 2025 FUDMA JOURNAL OF SCIENCES

This work is licensed under a Creative Commons Attribution 4.0 International License.
FUDMA Journal of Sciences