MODIFIED ADAPTIVE LASSO FOR CLASSIFICATION OF HIGH DIMENSIONAL DATA
Abstract
High-dimensional classification problems, such as gene expression analysis in medical research, require effective variable selection to improve predictive accuracy and interpretability. Penalized logistic regression methods such as LASSO and Elastic Net are widely applied for simultaneous variable selection and coefficient estimation. However, these methods suffer from selection bias and handle correlated predictors inefficiently. This study introduces the Modified Adaptive LASSO (MALASSO), a novel approach that incorporates an improved weighting mechanism based on ridge regression estimates. The new weighting scheme mitigates the selection bias observed in LASSO-based methods and improves classification performance on datasets with highly correlated features. MALASSO was evaluated through extensive simulations and through real-world applications to leukemia and colon cancer gene expression datasets. The results indicate that MALASSO outperforms existing methods, achieving higher classification accuracy (98.45% for leukemia and 100% for colon cancer) while selecting fewer, more relevant variables. Compared with Adaptive LASSO (ALASSO) and Adaptive Elastic Net (AEnet), MALASSO shows improved robustness and model sparsity, which highlights its potential for high-dimensional medical diagnostics and biomarker discovery. The study thereby addresses critical shortcomings of existing penalized regression techniques. Future work will explore MALASSO's applicability to multiclass classification and other high-dimensional domains.
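The general idea of ridge-based adaptive weights can be illustrated with a minimal sketch. The abstract does not give the exact MALASSO weighting formula, so the code below assumes the standard adaptive LASSO form, w_j = 1 / |ridge estimate_j|^gamma, applied to penalized logistic regression and solved by proximal gradient descent. All function names, the tuning values (`lam`, `gamma`, `eps`), and the synthetic data are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(z, t):
    """Elementwise soft-thresholding: the proximal operator of a weighted L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def step_size(X, ridge=0.0):
    # Inverse of an upper bound on the gradient's Lipschitz constant
    # (the "+ 1.0" accounts for the unpenalized intercept column).
    n = X.shape[0]
    return 1.0 / (0.25 * (np.linalg.norm(X, 2) ** 2 / n + 1.0) + ridge)

def fit_ridge_logistic(X, y, lam=1.0, n_iter=500):
    """L2-penalized logistic regression by gradient descent (initial estimates)."""
    n, p = X.shape
    beta, b0 = np.zeros(p), 0.0
    step = step_size(X, ridge=lam)
    for _ in range(n_iter):
        resid = sigmoid(X @ beta + b0) - y
        beta = beta - step * (X.T @ resid / n + lam * beta)
        b0 = b0 - step * resid.mean()
    return beta, b0

def ridge_weights(beta_ridge, gamma=1.0, eps=1e-8):
    # Assumed weight form: small ridge coefficients receive a large penalty.
    return 1.0 / (np.abs(beta_ridge) + eps) ** gamma

def fit_adaptive_lasso_logistic(X, y, weights, lam=0.05, n_iter=1000):
    """Weighted-L1 logistic regression by proximal gradient descent (ISTA)."""
    n, p = X.shape
    beta, b0 = np.zeros(p), 0.0
    step = step_size(X)
    for _ in range(n_iter):
        resid = sigmoid(X @ beta + b0) - y
        beta = soft_threshold(beta - step * (X.T @ resid / n), step * lam * weights)
        b0 = b0 - step * resid.mean()
    return beta, b0

# Synthetic demo with more features than samples (hypothetical data).
rng = np.random.default_rng(0)
n, p = 60, 100
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [2.0, -2.0, 1.5]  # only the first three features matter
y = (rng.random(n) < sigmoid(X @ true_beta)).astype(float)

beta_ridge, _ = fit_ridge_logistic(X, y, lam=1.0)
w = ridge_weights(beta_ridge)
beta_hat, b0_hat = fit_adaptive_lasso_logistic(X, y, w, lam=0.05)
print("selected features:", np.flatnonzero(beta_hat))
```

Ridge estimates are a common choice for the initial step when the number of variables exceeds the number of samples, because they remain well defined where unpenalized maximum likelihood estimates do not. In practice `lam` and `gamma` would be chosen by cross-validation rather than fixed as here.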
Copyright (c) 2025 FUDMA JOURNAL OF SCIENCES

This work is licensed under a Creative Commons Attribution 4.0 International License.