A MULTI-HYBRID FRAMEWORK FOR IMAGE DATA AUGMENTATION IN A RANDOMIZED SETTING
DOI:
https://doi.org/10.33003/fjs-2026-1002-4338
Keywords:
Image Augmentation, Hybrid Transformations, Randomized Techniques, Machine Learning Benchmarking, Model Independent Framework, Data Diversity/Similarity, Overfitting, KNN, SVM, ResNet50
Abstract
Image Data Augmentation (IDA) is fundamental for improving model robustness and reducing overfitting; however, many existing approaches are tightly coupled to specific deep learning models, which limits their applicability largely to classification tasks. This study proposes a model-agnostic, stand-alone multi-hybrid IDA framework that operates independently of any machine learning (ML) model. The framework is implemented as a Python-based system integrating five common transformation techniques: rotation, translation, zooming, flipping, and color-space manipulation. The proposed Randomized Framework was benchmarked against three established augmentation pipelines, with a Sequential mode serving as an internal baseline. Performance was evaluated in terms of dataset diversity, class preservation, execution time, and classifier validation using K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) models on features extracted via ResNet-50. Results demonstrate that the Randomized Framework achieved the highest diversity score (0.3767) with optimal execution time (5.20 s), alongside superior generalization evidenced by the lowest overfitting gap (0.03) and strong classification performance (AUC = 0.9596 for KNN). Classification accuracy improved from 0.91–0.93 to 0.95–1.00 after augmentation. The study concludes that randomized multi-hybrid augmentation offers an effective balance between diversity and efficiency and, because its design is independent of any specific machine learning model, is well suited to diverse computer vision applications. Future work will investigate additional transformations, deep learning–based augmentation pipelines, and scalable batch-level optimization.
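The contrast between the Sequential and Randomized modes can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say how transformations are selected or parameterized, so the selection rule (a random non-empty subset applied in random order), the function names, and the simplified transformations (90-degree rotations, wrap-around translation, centre-crop zoom, channel permutation) are all assumptions made for illustration.

```python
import random

# An image is a list of rows; each pixel is an (r, g, b) tuple.
# Every transformation below is a simplified, hypothetical stand-in.

def rotate(img, rng):
    """Rotate clockwise by a random multiple of 90 degrees."""
    for _ in range(rng.choice([1, 2, 3])):
        img = [list(row) for row in zip(*img[::-1])]
    return img

def translate(img, rng):
    """Shift by a random offset, wrapping pixels around the border."""
    h, w = len(img), len(img[0])
    dy, dx = rng.randrange(h), rng.randrange(w)
    rows = img[-dy:] + img[:-dy]
    return [row[-dx:] + row[:-dx] for row in rows]

def zoom(img, rng):
    """Crop a centred window and resize it back (nearest neighbour)."""
    h, w = len(img), len(img[0])
    m = min(rng.choice([1, 2]), (min(h, w) - 1) // 2)  # crop margin
    ch, cw = h - 2 * m, w - 2 * m
    return [[img[m + (y * ch) // h][m + (x * cw) // w] for x in range(w)]
            for y in range(h)]

def flip(img, rng):
    """Mirror the image horizontally."""
    return [row[::-1] for row in img]

def color_shift(img, rng):
    """Permute the colour channels at random."""
    order = rng.sample(range(3), 3)
    return [[tuple(px[c] for c in order) for px in row] for row in img]

TRANSFORMS = [rotate, translate, zoom, flip, color_shift]

def augment(img, mode="randomized", seed=None):
    """Return (augmented image, names of the transformations applied)."""
    rng = random.Random(seed)
    if mode == "sequential":
        chain = list(TRANSFORMS)            # all five, fixed order
    else:
        k = rng.randint(1, len(TRANSFORMS))
        chain = rng.sample(TRANSFORMS, k)   # assumed: random subset, random order
    for transform in chain:
        img = transform(img, rng)
    return img, [t.__name__ for t in chain]

if __name__ == "__main__":
    image = [[(x, y, x + y) for x in range(8)] for y in range(8)]
    _, applied = augment(image, mode="randomized", seed=42)
    print(applied)
```

Because the sketch touches only pixel data and never a model, its output could be passed to any downstream learner, which is the sense in which such a pipeline is model-agnostic.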
License
Copyright (c) 2026 Peter Jacob Adoyi, Samuel Oluwatosin Hassan, Samera Otor, Adekunle Adedotun Adeyelu

This work is licensed under a Creative Commons Attribution 4.0 International License.