ROBERTaBART_X: A HYBRID TRANSFORMER MODEL FOR ENHANCING AUTOMATED CODE GENERATION
Keywords:
Automated Code Generation, Transformer Models, RoBERTa, BART, Hybrid Architectures
Abstract
Automated code generation (ACG) has become a significant part of the software engineering process, enabling code to be produced with greater speed and precision. Despite its substantial impact on software engineering, persistent issues remain, including the absence of long-term context, poor debugging, lack of domain adaptation, and functional inaccuracy. The model proposed herein, RoBERTaBART_X, is a hybrid transformer model based on RoBERTa and BART, supplemented by task-adaptive pretraining (TAPT), domain-specific data augmentation (DA), retrieval-augmented generation (RAG), FlashAttention, and sparse attention. Experiments were performed on standard datasets, including CoNaLa, Django, CodeSearchNet, and HumanEval, and evaluated using BLEU, CodeBLEU, Exact Match Accuracy, Syntax Validity, and Execution Accuracy. The results show that RoBERTaBART_X outperforms the baseline models CodeBERT, CodeT5, RoBERTaMarian, and RoBERTaBART in semantic correctness, syntactic validity, execution success, CodeBLEU, and Pass@k. Most notably, RoBERTaBART_X achieves +6.1 BLEU and +6.6% Execution Accuracy on CoNaLa, +4.8% Execution Accuracy on Django, and +3.2% CodeBLEU on CodeSearchNet, demonstrating strong performance across diverse tasks. Given these findings, we recommend RoBERTaBART_X as the highest-performing model to date for generating resilient, executable code. We believe that stacking strong encoders on top of autoregressive decoders and training them with these targeted techniques has the potential to push automated code generation research even further.
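The Pass@k metric cited in the evaluation is conventionally computed with the unbiased estimator introduced alongside HumanEval (Chen et al., 2021); assuming the paper follows that convention, a minimal sketch is:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator (Chen et al., 2021).

    n: total candidate programs sampled per problem
    c: number of candidates that pass all unit tests
    k: evaluation budget (k <= n)
    """
    # If fewer than k candidates fail, at least one of any k draws passes.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing candidate,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 2 samples, 1 passing, budget of 1 -> 0.5 chance of success.
print(pass_at_k(2, 1, 1))
```

Averaging this quantity over all problems in a benchmark such as HumanEval yields the reported Pass@k score.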
Copyright (c) 2025 Adedayo Ajibade, Olaniyan Olatayo Moses

This work is licensed under a Creative Commons Attribution 4.0 International License.