A MULTI-HYBRID FRAMEWORK FOR IMAGE DATA AUGMENTATION IN A RANDOMIZED SETTING

Peter Jacob Adoyi; Samuel Oluwatosin Hassan; Samera Otor; Adekunle Adedotun Adeyelu

doi:10.33003/fjs-2026-1002-4338

Authors

Peter Jacob Adoyi, Rev. Fr. Moses Orshio Adasu University, Makurdi
Samuel Oluwatosin Hassan, Olabisi Onabanjo University, Ago-Iwoye
Samera Otor, Rev. Fr. Moses Orshio Adasu (former Benue State) University, Makurdi, Nigeria
Adekunle Adedotun Adeyelu

DOI:

https://doi.org/10.33003/fjs-2026-1002-4338

Keywords:

Image Augmentation, Hybrid Transformations, Randomized Techniques, Machine Learning Benchmarking, Model-Independent Framework, Data Diversity/Similarity, Overfitting, KNN, SVM, ResNet50

Abstract

Image Data Augmentation (IDA) is fundamental to improving model robustness and reducing overfitting; however, many existing approaches are tightly integrated into specific deep learning models, which limits their applicability primarily to classification tasks. This study proposes a model-agnostic, stand-alone multi-hybrid IDA framework that operates independently of any machine learning (ML) model. The framework is implemented as a Python-based system integrating five common transformation techniques: rotation, translation, zooming, flipping, and color-space manipulation. The proposed Randomized Framework was benchmarked against three established augmentation pipelines, with a Sequential mode serving as an internal baseline. Performance was evaluated in terms of dataset diversity, class preservation, execution time, and classifier validation using K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) models on features extracted via ResNet-50. Results show that the Randomized Framework achieved the highest diversity score (0.3767) with the best execution time (5.20 s), alongside superior generalization, evidenced by the lowest overfitting gap (0.03) and strong classification performance (AUC = 0.9596 for KNN). Classification accuracy improved from 0.91–0.93 to 0.95–1.00 after augmentation. The study concludes that randomized multi-hybrid augmentation offers an effective balance between diversity and efficiency and that, owing to its model-independent design, it is well suited to diverse computer vision applications. Future work will investigate additional transformations, deep learning–based augmentation pipelines, and scalable batch-level optimization.
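The difference between the Randomized and Sequential modes described in the abstract can be illustrated with a minimal, NumPy-only sketch. This is not the authors' implementation. The function names, parameter ranges, the rule that the randomized mode draws between two and five transformations in random order, and the pixel-difference diversity proxy are all assumptions made for illustration; the abstract does not define how its diversity score is computed.

```python
import numpy as np

# Hypothetical stand-ins for the five transformation families named in the
# abstract. Parameter ranges are illustrative assumptions, not the paper's.
# Images are assumed to be square H x W x 3 uint8 arrays.

def rotate(img, rng):
    # Rotation limited to multiples of 90 degrees to stay NumPy-only.
    return np.rot90(img, k=int(rng.integers(1, 4)))

def translate(img, rng):
    # Circular shift of up to roughly 10% of each spatial dimension.
    h, w = img.shape[:2]
    my, mx = max(1, h // 10), max(1, w // 10)
    dy = int(rng.integers(-my, my + 1))
    dx = int(rng.integers(-mx, mx + 1))
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

def zoom(img, rng):
    # Zoom in: crop a central window, then resize back by nearest neighbour.
    h, w = img.shape[:2]
    scale = rng.uniform(0.7, 0.95)
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    return crop[rows][:, cols]

def flip(img, rng):
    # Horizontal or vertical flip with equal probability.
    return img[:, ::-1] if rng.random() < 0.5 else img[::-1]

def color_shift(img, rng):
    # Per-channel gain as a simple colour-space manipulation.
    gains = rng.uniform(0.8, 1.2, size=img.shape[-1])
    return np.clip(img.astype(np.float32) * gains, 0, 255).astype(np.uint8)

TRANSFORMS = [rotate, translate, zoom, flip, color_shift]

def augment(img, rng, mode="randomized"):
    if mode == "sequential":
        # Internal baseline: all five transformations in a fixed order.
        chain = TRANSFORMS
    else:
        # Assumed randomized rule: 2 to 5 distinct transformations, random order.
        k = int(rng.integers(2, len(TRANSFORMS) + 1))
        idx = rng.choice(len(TRANSFORMS), size=k, replace=False)
        chain = [TRANSFORMS[i] for i in idx]
    out = img
    for transform in chain:
        out = transform(out, rng)
    return out

def diversity(originals, augmented):
    # Assumed proxy only: mean absolute pixel difference scaled to [0, 1].
    diffs = [
        np.abs(a.astype(np.float32) - o.astype(np.float32)).mean() / 255.0
        for o, a in zip(originals, augmented)
    ]
    return float(np.mean(diffs))

rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(8)]
randomized = [augment(im, rng, "randomized") for im in images]
sequential = [augment(im, rng, "sequential") for im in images]
print(diversity(images, randomized), diversity(images, sequential))
```

Because the sketch takes only arrays as input and returns arrays, it does not depend on any particular classifier, which mirrors the model-independent design the abstract emphasizes. The random-noise images and the proxy score here are placeholders, so the printed values say nothing about the results reported in the study.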

Author Biographies

Samuel Oluwatosin Hassan, Olabisi Onabanjo University, Ago-Iwoye

Department of Computer Sciences

Samera Otor, Rev. Fr. Moses Orshio Adasu (former Benue State) University, Makurdi, Nigeria

Dr. Samara Utor, Senior Lecturer, Department of Mathematics/Computer Science

Adekunle Adedotun Adeyelu

Professor of Distributed Systems, HOD, Department of Mathematics/Computer Science
