Latest AI Research

Discover Cutting-Edge AI Resources

Explore the latest research papers, models, applications, and projects from arXiv, HuggingFace, and GitHub. Your comprehensive AI navigation hub.

Latest Papers

The latest research papers from arXiv


ShortCoder: Knowledge-Augmented Syntax Optimization for Token-Efficient Code Generation

Jan 14, 2026
9 authors

Code generation tasks aim to automate the conversion of user requirements into executable code, significantly reducing manual development effort and enhancing software productivity. The emergence of large language models (LLMs) has greatly advanced code generation, though their efficiency is still limited by inherent architectural constraints: generating each token requires a complete inference pass, which demands persistent retention of contextual information in memory and escalates resource consumption. While existing research prioritizes inference-phase optimizations such as prompt compression and model quantization, the generation phase remains underexplored. To tackle these challenges, we propose a knowledge-infused framework named ShortCoder, which optimizes code generation efficiency while preserving semantic equivalence and readability. In particular, we introduce: (1) ten syntax-level simplification rules for Python, derived from AST-preserving transformations, achieving an 18.1% token reduction without functional compromise; (2) a hybrid data synthesis pipeline integrating rule-based rewriting with LLM-guided refinement, producing ShorterCodeBench, a corpus of validated, semantically consistent pairs of original and simplified code; (3) a fine-tuning strategy that injects conciseness awareness into the base LLMs. Extensive experiments demonstrate that ShortCoder consistently outperforms state-of-the-art methods on HumanEval, improving generation efficiency by 18.1%-37.8% over previous methods while preserving code generation quality.

cs.SE · cs.AI · cs.CL
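
The abstract names ten Python simplification rules but does not list them. As a rough illustration only, the sketch below implements one plausible syntax-level rewrite of the same flavor using Python's standard ast module: turning assignments of the form x = x + expr into the shorter x += expr, with a crude whitespace-token count standing in for the paper's token-reduction metric. The AugAssignRewriter class and the rule itself are hypothetical assumptions, not taken from ShortCoder.

    import ast

    class AugAssignRewriter(ast.NodeTransformer):
        # Rewrite "name = name <op> expr" into "name <op>= expr".
        def visit_Assign(self, node):
            self.generic_visit(node)
            if (
                len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.BinOp)
                and isinstance(node.value.left, ast.Name)
                and node.targets[0].id == node.value.left.id
            ):
                # Build the augmented assignment. For mutable types "+="
                # mutates in place, so a real rule set would need type-aware
                # guards before rewriting.
                return ast.copy_location(
                    ast.AugAssign(
                        target=ast.Name(id=node.targets[0].id, ctx=ast.Store()),
                        op=node.value.op,
                        value=node.value.right,
                    ),
                    node,
                )
            return node

    def simplify(source):
        # Parse, rewrite, and unparse (ast.unparse requires Python 3.9+).
        tree = ast.fix_missing_locations(AugAssignRewriter().visit(ast.parse(source)))
        return ast.unparse(tree)

    if __name__ == "__main__":
        original = "total = total + price * quantity"
        shorter = simplify(original)
        print(shorter)  # total += price * quantity
        # Whitespace tokens as a stand-in for the paper's token metric: 7 -> 5.
        print(len(original.split()), "->", len(shorter.split()))

Because x = x + y and x += y can differ for mutable values (for lists, += mutates in place), any production version of such a rule would need the kind of semantic-consistency validation the abstract attributes to the ShorterCodeBench pipeline.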

Featured Blog Posts

The latest insights and tutorials from our team


Popular AI Tools

Discover the best AI tools and alternatives

Browse by Category

Explore AI research by topic