REGRESSION ESTIMATION AND FEATURE SELECTION USING MODIFIED CORRELATION-ADJUSTED ELASTIC NET PENALTIES

  • Olayiwola Babarinsa, Federal University Lokoja, https://orcid.org/0000-0002-3569-0828
  • Helen Edogbanya, Department of Mathematics, Federal University Lokoja, P.M.B. 1154, Lokoja, Nigeria
  • Ovye Abari, Department of Computer Science, Federal University Lokoja, P.M.B. 1154, Lokoja, Nigeria
  • Isaac Adeniyi, Department of Statistics, Federal University Lokoja, P.M.B. 1154, Lokoja, Nigeria
Keywords: Variable selection, Regularization, High-dimensional data, Grouping effect, Machine learning, LASSO

Abstract

Regularized regression techniques such as the least absolute shrinkage and selection operator (LASSO), the elastic net, and the type 1 and type 2 correlation-adjusted elastic net (CAEN1 and CAEN2, respectively) are used in machine learning to perform variable selection and coefficient estimation simultaneously. This study proposes modified estimators based on CAEN1 and CAEN2 that rescale the estimates to undo the double shrinkage incurred by applying two penalties. The scale factors are derived by decomposing the correlation matrix of the predictors; because they depend on the magnitude of the correlations among the predictors, they ensure that the elastic net is included as a special case. Estimation is carried out using a robust worst-case quadratic solver algorithm. Simulations show that the proposed estimators, referred to as the corrected correlation-adjusted elastic net (CCAEN1 and CCAEN2), perform competitively with CAEN1, CAEN2, LASSO, and the elastic net in terms of variable selection, estimation, and prediction accuracy. CCAEN1 yields the best results when the number of predictors exceeds the number of observations, while CCAEN2 performs best in the presence of a grouping effect, where highly correlated predictors tend to be included in or excluded from the model together. Applications to two real-life datasets further demonstrate the advantages of the proposed methods for machine learning.
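As context for the double-shrinkage correction described in the abstract, the sketch below recalls the classical elastic net of Zou and Hastie (2005), written in standard notation (response y, design matrix X, coefficients β, penalty parameters λ1 and λ2): the naive estimator is shrunk by both penalties, and multiplying it by the constant factor (1 + λ2) undoes the extra shrinkage. The correlation-dependent scale factors that define CCAEN1 and CCAEN2 generalize this constant factor; their exact form is derived in the full paper and is not reproduced here.

\[
\hat{\beta}^{\mathrm{naive}} = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \lVert \beta \rVert_2^2,
\qquad
\hat{\beta}^{\mathrm{EN}} = (1 + \lambda_2)\,\hat{\beta}^{\mathrm{naive}}.
\]

Per the abstract, CCAEN1 and CCAEN2 replace the constant (1 + λ2) with scale factors obtained from a decomposition of the predictors' correlation matrix, so that the elastic net is recovered as a special case.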

Author Biography

Olayiwola Babarinsa, Federal University Lokoja

Department of Mathematics

Federal University Lokoja

P.M.B. 1154

References

Algamal, Z. Y. (2015). Penalized Poisson regression model using adaptive modified elastic net penalty. Electronic Journal of Applied Statistical Analysis, 8(2), 236-245.

Anbari, M. E., & Mkhadri, A. (2014). Penalized regression combining the L1 norm and a correlation based penalty. Sankhya B, 76, 82-102. DOI: https://doi.org/10.1007/s13571-013-0065-4

Babarinsa, O., Sofi, A. Z. M., Mohd, A. H., Eluwole, A., Sunday, I., Adamu, W., Daniel, L. (2022). Note on the history of (square) matrix and determinant. FUDMA JOURNAL OF SCIENCES, 6(3), 177-190. DOI: https://doi.org/10.33003/fjs-2022-0603-775

Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183-202. DOI: https://doi.org/10.1137/080716542

Biecek, P., & Burzykowski, T. (2021). Explanatory model analysis: explore, explain, and examine predictive models: Chapman and Hall/CRC. DOI: https://doi.org/10.1201/9780429027192

Bondell, H. D., & Reich, B. J. (2006). Simultaneous regression shrinkage, variable selection and clustering of predictors with OSCAR.

Breiman, L. (1996). Heuristics of instability and stabilization in model selection. The Annals of Statistics, 24(6), 2350-2383. DOI: https://doi.org/10.1214/aos/1032181158

Efron, B., & Tibshirani, R. (1997). Improvements on cross-validation: the .632+ bootstrap method. Journal of the American Statistical Association, 92(438), 548-560. DOI: https://doi.org/10.1080/01621459.1997.10474007

Efron, B., & Tibshirani, R. J. (1994). An introduction to the bootstrap: Chapman and Hall/CRC. DOI: https://doi.org/10.1201/9780429246593

Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348-1360. DOI: https://doi.org/10.1198/016214501753382273

Fan, J., & Li, R. (2006). Statistical challenges with high dimensionality: Feature selection in knowledge discovery. arXiv preprint math/0602133.

Fu, W. J. (1998). Penalized regressions: the bridge versus the lasso. Journal of Computational and Graphical Statistics, 7(3), 397-416. DOI: https://doi.org/10.1080/10618600.1998.10474784

Garba, W., Yahya, G., & Aremu, M. (2016). Multiclass Sequential Feature Selection and Classification Method for Genomic Data. Blood, 7(10).

Grandvalet, Y., Chiquet, J., & Ambroise, C. (2012). Sparsity by Worst-Case Penalties. arXiv preprint arXiv:1210.2077.

Hanke, M., Dijkstra, L., Foraita, R., & Didelez, V. (2024). Variable selection in linear regression models: Choosing the best subset is not always the best choice. Biometrical Journal, 66(1), 2200209. DOI: https://doi.org/10.1002/bimj.202200209

Hapfelmeier, A., Babatunde, W., Yahya, R. R., & Ulm, K. (2012). Predictive modeling of gene expression data (Chapter 26). Handbook of Statistics in Clinical Oncology, 4, 71. DOI: https://doi.org/10.1201/b11800-31

Hoerl, A., & Kennard, R. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67. DOI: https://doi.org/10.1080/00401706.1970.10488634

Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Paper presented at the International Joint Conference on Artificial Intelligence.

Ryan, T. (2008). Modern regression methods (Vol. 655): John Wiley & Sons. DOI: https://doi.org/10.1002/9780470382806

Scheetz, T. E., Kim, K.-Y. A., Swiderski, R. E., Philp, A. R., Braun, T. A., Knudtson, K. L., . . . Casavant, T. L. (2006). Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proceedings of the National Academy of Sciences, 103(39), 14429-14434. DOI: https://doi.org/10.1073/pnas.0602562103

Stamey, T. A., Warrington, J. A., Caldwell, M. C., Chen, Z., Fan, Z., Mahadevappa, M., . . . Zhang, Z. (2001). Molecular genetic profiling of Gleason grade 4/5 prostate cancers compared to benign prostatic hyperplasia. The Journal of Urology, 166(6), 2171-2177. DOI: https://doi.org/10.1016/S0022-5347(05)65528-0

Tan, Q. E. A. (2012). Correlation adjusted penalization in regression analysis (Doctoral dissertation). University of Manitoba, Canada.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(1), 267-288. DOI: https://doi.org/10.1111/j.2517-6161.1996.tb02080.x

Tutz, G., & Ulbricht, J. (2009). Penalized regression with correlation-based penalty. Statistics and Computing, 19, 239-253. DOI: https://doi.org/10.1007/s11222-008-9088-5

Wang, X., Dunson, D., & Leng, C. (2016). No penalty no tears: Least squares in high-dimensional linear models. Paper presented at the International Conference on Machine Learning.

Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894-942. DOI: https://doi.org/10.1214/09-AOS729

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(2), 301-320. DOI: https://doi.org/10.1111/j.1467-9868.2005.00503.x

Published
2025-01-31
How to Cite
Babarinsa, O., Edogbanya, H., Abari, O., & Adeniyi, I. (2025). REGRESSION ESTIMATION AND FEATURE SELECTION USING MODIFIED CORRELATION-ADJUSTED ELASTIC NET PENALTIES. FUDMA JOURNAL OF SCIENCES, 9(1), 29-40. https://doi.org/10.33003/fjs-2025-0901-2774
