MODIFIED ADAPTIVE LASSO FOR CLASSIFICATION OF HIGH DIMENSIONAL DATA

Authors

  • Emmanuel Lekwot
    Ahmadu Bello University, Zaria
  • Tukur Dahiru
    Department of Community Medicine, Ahmadu Bello University, Zaria
  • Husseini Garba Dikko
    Department of Statistics, Ahmadu Bello University, Zaria
  • Enoch Yabkwa Yanshak
    Department of Statistics, Ahmadu Bello University, Zaria

Keywords:

High-dimensional data, Modified Adaptive LASSO (MALASSO), Penalized logistic regression, Gene expression analysis, Variable selection

Abstract

High-dimensional classification problems, such as gene expression analysis in medical research, require effective variable selection techniques to improve predictive accuracy and interpretability. Traditional penalized logistic regression methods, such as LASSO and Elastic Net, have been widely applied for simultaneous variable selection and coefficient estimation. However, these methods suffer from limitations, including selection bias and inefficiencies in handling correlated predictors. This study introduces the Modified Adaptive LASSO (MALASSO), a novel approach that enhances high-dimensional classification by incorporating an improved weighting mechanism based on ridge regression estimates. The new weighting scheme mitigates the selection bias observed in LASSO-based methods and improves classification performance in datasets with highly correlated features. To evaluate MALASSO’s effectiveness, extensive simulations and real-world applications were conducted using leukemia and colon cancer gene expression datasets. Results indicate that MALASSO outperforms existing methods, achieving superior classification accuracy (98.45% for leukemia and 100% for colon cancer) while selecting fewer, more relevant variables. Compared to Adaptive LASSO (ALASSO) and Adaptive Elastic Net (AEnet), MALASSO demonstrated improved robustness and model sparsity, highlighting its potential for high-dimensional medical diagnostics and biomarker discovery. This study contributes to the advancement of penalized regression techniques by addressing critical shortcomings in existing methods. Future work will explore MALASSO’s applicability to multiclass classification and other high-dimensional domains.
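
As a rough illustration of the weighting idea described in the abstract, the sketch below fits a ridge-penalized logistic regression, converts its coefficients into adaptive weights, and then fits a weighted L1-penalized logistic regression by proximal gradient descent. This is a minimal sketch of a generic ridge-weighted adaptive LASSO, not the authors' MALASSO. The abstract does not state the exact weight formula, so the weight w_j = 1 / (|b_j| + eps)^gamma, all function names, the tuning values, the toy data, and the omission of an intercept are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero by `threshold`."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def logistic_gradient(X, y, beta):
    """Gradient of the average logistic loss (no intercept, for brevity)."""
    n, p = len(X), len(beta)
    grad = [0.0] * p
    for row, label in zip(X, y):
        residual = sigmoid(sum(b * x for b, x in zip(beta, row))) - label
        for j in range(p):
            grad[j] += residual * row[j] / n
    return grad


def fit_ridge_logistic(X, y, lam, step=0.5, iters=500):
    """Gradient descent on logistic loss + (lam / 2) * ||beta||^2."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad = logistic_gradient(X, y, beta)
        beta = [b - step * (g + lam * b) for b, g in zip(beta, grad)]
    return beta


def adaptive_weights(initial, gamma=1.0, eps=1e-8):
    """Assumed weight form: large initial coefficients get small penalties."""
    return [1.0 / (abs(b) + eps) ** gamma for b in initial]


def fit_weighted_l1_logistic(X, y, lam, weights, step=0.5, iters=500):
    """Proximal gradient descent on logistic loss + lam * sum_j w_j * |beta_j|."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad = logistic_gradient(X, y, beta)
        beta = [
            soft_threshold(b - step * g, step * lam * w)
            for b, g, w in zip(beta, grad, weights)
        ]
    return beta


def ridge_weighted_adaptive_lasso(X, y, lam_ridge=0.1, lam_l1=0.05, gamma=1.0):
    """Two-stage fit: ridge estimates -> adaptive weights -> weighted L1 fit."""
    initial = fit_ridge_logistic(X, y, lam_ridge)
    weights = adaptive_weights(initial, gamma)
    return fit_weighted_l1_logistic(X, y, lam_l1, weights)


if __name__ == "__main__":
    # Hypothetical toy data: the first feature tracks the class label,
    # the second carries little signal.
    X = [[1.0, 0.2], [-1.0, 0.1], [2.0, -0.1], [-2.0, -0.2]]
    y = [1, 0, 1, 0]
    print(ridge_weighted_adaptive_lasso(X, y))
```

Ridge estimates are a common choice for the first stage because, unlike unpenalized maximum likelihood, they remain defined when predictors outnumber samples and are stable when predictors are highly correlated. In practice the two penalty levels would be chosen by cross-validation rather than fixed as they are here.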

Author Biographies

Tukur Dahiru

Professor Dahiru Tukur, Senior Lecturer, Department of Community Medicine, Ahmadu Bello University, Zaria

Husseini Garba Dikko

Professor Husseini Garba Dikko, Senior Lecturer, Department of Statistics, Ahmadu Bello University, Zaria

Enoch Yabkwa Yanshak

Mr Enoch Yabkwa Yanshak, M.Sc. Student, Ahmadu Bello University, Zaria

Published

17-04-2025

How to Cite

MODIFIED ADAPTIVE LASSO FOR CLASSIFICATION OF HIGH DIMENSIONAL DATA. (2025). FUDMA JOURNAL OF SCIENCES, 9(2), 290-297. https://doi.org/10.33003/fjs-2025-0902-3248
