REGRESSION ESTIMATION AND FEATURE SELECTION USING MODIFIED CORRELATION-ADJUSTED ELASTIC NET PENALTIES
Abstract
Regularized regression techniques such as the least absolute shrinkage and selection operator (LASSO), the elastic net, and the type 1 and type 2 correlation-adjusted elastic net (CAEN1 and CAEN2, respectively) are used in machine learning to carry out variable selection and coefficient estimation simultaneously. This study proposes modified estimators based on CAEN1 and CAEN2 that rescale the estimates to undo the double shrinkage incurred through the application of two penalties. The scale factors are derived by decomposing the correlation matrix of the predictors. Because these factors depend on the magnitude of the correlations among the predictors, they ensure that the elastic net is included as a special case. Estimation is carried out using a robust worst-case quadratic solver algorithm. Simulations show that the proposed estimators, referred to as the corrected correlation-adjusted elastic net (CCAEN1 and CCAEN2), perform competitively with CAEN1, CAEN2, LASSO, and the elastic net in terms of variable selection, estimation, and prediction accuracy. CCAEN1 yields the best results when the number of predictors exceeds the number of observations, and CCAEN2 performs best in the presence of a grouping effect, whereby highly correlated predictors tend to be included in or excluded from the model together. Applications to two real-life datasets further demonstrate the advantages of the proposed methods for machine learning.
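For readers unfamiliar with the double-shrinkage issue, a minimal sketch of the analogous correction in the standard elastic net (Zou & Hastie, 2005) may help; the correlation-dependent scale factors proposed in this paper generalize the constant factor shown here and are not reproduced below. In the standard formulation, the naive estimate is shrunk once by each penalty, and the final estimate rescales it by $(1+\lambda_2)$:

\[
\hat{\beta}^{\,\text{naive}} \;=\; \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 \;+\; \lambda_1 \lVert \beta \rVert_1 \;+\; \lambda_2 \lVert \beta \rVert_2^2,
\qquad
\hat{\beta}^{\,\text{EN}} \;=\; (1+\lambda_2)\,\hat{\beta}^{\,\text{naive}}.
\]

A correction factor that varies with the correlations among the predictors reduces to a constant of this kind when those correlations vanish, which is consistent with the abstract's statement that the elastic net is included as a special case of the proposed estimators.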
References
Algamal, Z. Y. (2015). Penalized Poisson regression model using adaptive modified elastic net penalty. Electronic Journal of Applied Statistical Analysis, 8(2), 236-245.
Anbari, M. E., & Mkhadri, A. (2014). Penalized regression combining the L1 norm and a correlation based penalty. Sankhya B, 76, 82-102. DOI: https://doi.org/10.1007/s13571-013-0065-4
Babarinsa, O., Sofi, A. Z. M., Mohd, A. H., Eluwole, A., Sunday, I., Adamu, W., & Daniel, L. (2022). Note on the history of (square) matrix and determinant. FUDMA Journal of Sciences, 6(3), 177-190. DOI: https://doi.org/10.33003/fjs-2022-0603-775
Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183-202. DOI: https://doi.org/10.1137/080716542
Biecek, P., & Burzykowski, T. (2021). Explanatory model analysis: explore, explain, and examine predictive models: Chapman and Hall/CRC. DOI: https://doi.org/10.1201/9780429027192
Bondell, H. D., & Reich, B. J. (2006). Simultaneous regression shrinkage, variable selection and clustering of predictors with OSCAR.
Breiman, L. (1996). Heuristics of instability and stabilization in model selection. The Annals of Statistics, 24(6), 2350-2383. DOI: https://doi.org/10.1214/aos/1032181158
Efron, B., & Tibshirani, R. (1997). Improvements on cross-validation: the .632+ bootstrap method. Journal of the American Statistical Association, 92(438), 548-560. DOI: https://doi.org/10.1080/01621459.1997.10474007
Efron, B., & Tibshirani, R. J. (1994). An introduction to the bootstrap: Chapman and Hall/CRC. DOI: https://doi.org/10.1201/9780429246593
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348-1360. DOI: https://doi.org/10.1198/016214501753382273
Fan, J., & Li, R. (2006). Statistical challenges with high dimensionality: Feature selection in knowledge discovery. arXiv preprint math/0602133.
Fu, W. J. (1998). Penalized regressions: the bridge versus the lasso. Journal of Computational and Graphical Statistics, 7(3), 397-416. DOI: https://doi.org/10.1080/10618600.1998.10474784
Garba, W., Yahya, G., & Aremu, M. (2016). Multiclass sequential feature selection and classification method for genomic data. Blood, 7(10).
Grandvalet, Y., Chiquet, J., & Ambroise, C. (2012). Sparsity by Worst-Case Penalties. arXiv preprint arXiv:1210.2077.
Hanke, M., Dijkstra, L., Foraita, R., & Didelez, V. (2024). Variable selection in linear regression models: Choosing the best subset is not always the best choice. Biometrical Journal, 66(1), 2200209. DOI: https://doi.org/10.1002/bimj.202200209
Hapfelmeier, A., Yahya, W. B., Rosenberg, R., & Ulm, K. (2012). Predictive modeling of gene expression data. In Handbook of Statistics in Clinical Oncology (3rd ed., Ch. 26). Chapman and Hall/CRC. DOI: https://doi.org/10.1201/b11800-31
Hoerl, A., & Kennard, R. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67. DOI: https://doi.org/10.1080/00401706.1970.10488634
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Paper presented at the International Joint Conference on Artificial Intelligence.
Ryan, T. (2008). Modern regression methods (Vol. 655): John Wiley & Sons. DOI: https://doi.org/10.1002/9780470382806
Scheetz, T. E., Kim, K.-Y. A., Swiderski, R. E., Philp, A. R., Braun, T. A., Knudtson, K. L., . . . Casavant, T. L. (2006). Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proceedings of the National Academy of Sciences, 103(39), 14429-14434. DOI: https://doi.org/10.1073/pnas.0602562103
Stamey, T. A., Warrington, J. A., Caldwell, M. C., Chen, Z., Fan, Z., Mahadevappa, M., . . . Zhang, Z. (2001). Molecular genetic profiling of Gleason grade 4/5 prostate cancers compared to benign prostatic hyperplasia. The Journal of urology, 166(6), 2171-2177. DOI: https://doi.org/10.1016/S0022-5347(05)65528-0
Tan, Q. E. A. (2012). Correlation adjusted penalization in regression analysis (Doctoral dissertation). University of Manitoba, Canada.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(1), 267-288. DOI: https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
Tutz, G., & Ulbricht, J. (2009). Penalized regression with correlation-based penalty. Statistics and Computing, 19, 239-253. DOI: https://doi.org/10.1007/s11222-008-9088-5
Wang, X., Dunson, D., & Leng, C. (2016). No penalty no tears: Least squares in high-dimensional linear models. Paper presented at the International Conference on Machine Learning.
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894-942. DOI: https://doi.org/10.1214/09-AOS729
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(2), 301-320. DOI: https://doi.org/10.1111/j.1467-9868.2005.00503.x