Awesome-Repo-Level-Code-Generation

Must-read papers on Repository-level Code Generation & Issue Resolution πŸ”₯

πŸ€–βœ¨ Awesome Repository-Level Code Generation βœ¨πŸ€–

🌟 A curated list of awesome repository-level code generation research papers and resources. If you want to contribute to this list (please do), feel free to send me a pull request. πŸš€ If you have any further questions, feel free to contact Yuling Shi or Xiaodong Gu (SJTU).

πŸ“š Contents

πŸ’₯ Repo-Level Issue Resolution

  • SWE-Effi: Re-Evaluating Software AI Agent System Effectiveness Under Resource Constraints [2025-09-arXiv] [πŸ“„ paper]

  • Diffusion is a code repair operator and generator [2025-08-arXiv] [πŸ“„ paper]

  • SWE-Exp: Experience-Driven Software Issue Resolution [2025-07-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution [2025-07-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • The SWE-Bench Illusion: When State-of-the-Art LLMs Remember Instead of Reason [2025-06-arXiv] [πŸ“„ paper]

  • Agent-RLVR: Training Software Engineering Agents via Guidance and Environment Rewards [2025-06-arXiv] [πŸ“„ paper]

  • EXPEREPAIR: Dual-Memory Enhanced LLM-based Repository-Level Program Repair [2025-06-arXiv] [πŸ“„ paper]

  • Coding Agents with Multimodal Browsing are Generalist Problem Solvers [2025-06-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • CoRet: Improved Retriever for Code Editing [2025-05-arXiv] [πŸ“„ paper]

  • Darwin GΓΆdel Machine: Open-Ended Evolution of Self-Improving Agents [2025-05-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development [2025-05-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Putting It All into Context: Simplifying Agents with LCLMs [2025-05-arXiv] [πŸ“„ paper]

  • SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning [2025-05-arXiv] [πŸ“„ blog] [πŸ”— repo]

  • AEGIS: An Agent-based Framework for General Bug Reproduction from Issue Descriptions [2025-FSE] [πŸ“„ paper]

  • Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [2025-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs [2025-03-arXiv] [πŸ“„ paper]

  • CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching [2025-03-arXiv] [πŸ“„ paper]

  • SEAlign: Alignment Training for Software Engineering Agent [2025-03-arXiv] [πŸ“„ paper]

  • DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [2025-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • LocAgent: Graph-Guided LLM Agents for Code Localization [2025-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning [2025-02-arXiv] [πŸ“„ paper]

  • SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution [2025-02-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution [2025-01-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • CodeMonkeys: Scaling Test-Time Compute for Software Engineering [2025-01-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Training Software Engineering Agents and Verifiers with SWE-Gym [2024-12-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • CODEV: Issue Resolving with Visual Data [2024-12-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [2024-11-arXiv] [πŸ“„ paper]

  • Globant Code Fixer Agent Whitepaper [2024-11] [πŸ“„ paper]

  • MarsCode Agent: AI-native Automated Bug Fixing [2024-11-arXiv] [πŸ“„ paper]

  • Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [2024-11-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement [2024-10-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • AutoCodeRover: Autonomous Program Improvement [2024-09-ISSTA] [πŸ“„ paper] [πŸ”— repo]

  • SpecRover: Code Intent Extraction via LLMs [2024-08-arXiv] [πŸ“„ paper]

  • OpenHands: An Open Platform for AI Software Developers as Generalist Agents [2024-07-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • AGENTLESS: Demystifying LLM-based Software Engineering Agents [2024-07-arXiv] [πŸ“„ paper]

  • RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph [2024-07-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • CodeR: Issue Resolving with Multi-Agent and Task Graphs [2024-06-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [2024-06-arXiv] [πŸ“„ paper]

  • SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [2024-NeurIPS] [πŸ“„ paper] [πŸ”— repo]

πŸ€– Repo-Level Code Completion

  • Enhancing Project-Specific Code Completion by Inferring Internal API Information [2025-07-TSE] [πŸ“„ paper] [πŸ”— repo]

  • CodeRAG: Supportive Code Retrieval on Bigraph for Real-World Code Generation [2025-04-arXiv] [πŸ“„ paper]

  • CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases [2025-04-NAACL] [πŸ“„ paper]

  • RTLRepoCoder: Repository-Level RTL Code Completion through the Combination of Fine-Tuning and Retrieval Augmentation [2025-04-arXiv] [πŸ“„ paper]

  • Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs [2025-04-AAAI] [πŸ“„ paper] [πŸ”— repo]

  • What to Retrieve for Effective Retrieval-Augmented Code Generation? An Empirical Study and Beyond [2025-03-arXiv] [πŸ“„ paper]

  • REPOFILTER: Adaptive Retrieval Context Trimming for Repository-Level Code Completion [2025-04-OpenReview] [πŸ“„ paper]

  • Improving FIM Code Completions via Context & Curriculum Based Learning [2024-12-arXiv] [πŸ“„ paper]

  • ContextModule: Improving Code Completion via Repository-level Contextual Information [2024-12-arXiv] [πŸ“„ paper]

  • A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse With Local-Aware, Global-Aware, and Third-Party-Library-Aware [2024-12-TSE] [πŸ“„ paper]

  • RepoGenReflex: Enhancing Repository-Level Code Completion with Verbal Reinforcement and Retrieval-Augmented Generation [2024-09-arXiv] [πŸ“„ paper]

  • RAMBO: Enhancing RAG-based Repository-Level Method Body Completion [2024-09-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • RLCoder: Reinforcement Learning for Repository-Level Code Completion [2024-07-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • STALL+: Boosting LLM-based Repository-level Code Completion with Static Analysis [2024-06-arXiv] [πŸ“„ paper]

  • GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model [2024-06-arXiv] [πŸ“„ paper]

  • Enhancing Repository-Level Code Generation with Integrated Contextual Information [2024-06-arXiv] [πŸ“„ paper]

  • R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models [2024-06-arXiv] [πŸ“„ paper]

  • Natural Language to Class-level Code Generation by Iterative Tool-augmented Reasoning over Repository [2024-05-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback [2024-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Repoformer: Selective Retrieval for Repository-Level Code Completion [2024-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion [2024-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • RepoMinCoder: Improving Repository-Level Code Generation Based on Information Loss Screening [2024-07-Internetware] [πŸ“„ paper]

  • CodePlan: Repository-Level Coding using LLMs and Planning [2024-07-FSE] [πŸ“„ paper] [πŸ”— repo]

  • DraCo: Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion [2024-05-ACL] [πŸ“„ paper] [πŸ”— repo]

  • RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [2023-10-EMNLP] [πŸ“„ paper] [πŸ”— repo]

  • Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context [2023-09-NeurIPS] [πŸ“„ paper] [πŸ”— repo]

  • RepoFusion: Training Code Models to Understand Your Repository [2023-06-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Repository-Level Prompt Generation for Large Language Models of Code [2023-06-ICML] [πŸ“„ paper] [πŸ”— repo]

  • Fully Autonomous Programming with Large Language Models [2023-06-GECCO] [πŸ“„ paper] [πŸ”— repo]

πŸ”„ Repo-Level Code Translation

  • EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation [2025-08-arXiv] [πŸ“„ paper]

  • Scalable, Validated Code Translation of Entire Projects using Large Language Models [2025-06-PLDI] [πŸ“„ paper]

  • A Systematic Literature Review on Neural Code Translation [2025-05-arXiv] [πŸ“„ paper]

  • C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques [2025-01-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Syzygy: Dual Code-Test C to (safe) Rust Translation using LLMs and Dynamic Analysis [2024-12-arXiv] [πŸ“„ paper] [πŸ•ΈοΈ website]

  • Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code [2024-04-ICSE] [πŸ“„ paper] [πŸ”— repo]

πŸ§ͺ Repo-Level Unit Test Generation

  • Execution-Feedback Driven Test Generation from SWE Issues [2025-08-arXiv] [πŸ“„ paper]

  • AssertFlip: Reproducing Bugs via Inversion of LLM-Generated Passing Tests [2025-07-arXiv] [πŸ“„ paper]

  • Issue2Test: Generating Reproducing Test Cases from Issue Reports [2025-03-arXiv] [πŸ“„ paper]

  • Agentic Bug Reproduction for Effective Automated Program Repair at Google [2025-02-arXiv] [πŸ“„ paper]

  • LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [2024-11-arXiv] [πŸ“„ paper]

πŸ” Repo-Level Code QA

  • SWE-QA: Can Language Models Answer Repository-level Code Questions? [2025-09-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Decompositional Reasoning for Graph Retrieval with Large Language Models [2025-06-arXiv] [πŸ“„ paper]

  • LongCodeBench: Evaluating Coding LLMs at 1M Context Windows [2025-05-arXiv] [πŸ“„ paper]

  • LocAgent: Graph-Guided LLM Agents for Code Localization [2025-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • CoReQA: Uncovering Potentials of Language Models in Code Repository Question Answering [2025-01-arXiv] [πŸ“„ paper]

  • RepoChat Arena [2025-Blog] [πŸ”— repo]

  • RepoChat: An LLM-Powered Chatbot for GitHub Repository Question-Answering [2025-MSR] [πŸ”— repo]

  • CodeQueries: A Dataset of Semantic Queries over Code [2022-09-arXiv] [πŸ“„ paper]

πŸ‘©β€πŸ’» Repo-Level Issue Task Synthesis

πŸ“Š Datasets and Benchmarks

  • SWE-QA: Can Language Models Answer Repository-level Code Questions? [2025-09-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? [2025-09] [πŸ“„ paper] [πŸ”— repo]

  • AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators [2025-08-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • LiveRepoReflection: Turning the Tide: Repository-based Code Reflection [2025-07-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? [2025-07-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code [2025-06-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks [2025-06-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench [2025-ACL] [πŸ“„ paper]

  • SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner [2025-ICML] [πŸ“„ paper] [πŸ”— repo]

  • AgentIssue-Bench: Can Agents Fix Agent Issues? [2025-08-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Smith: Scaling Data for Software Engineering Agents [2025-04-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Synth: Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs [2025-04-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Are "Solved Issues" in SWE-bench Really Solved Correctly? An Empirical Study [2025-03-arXiv] [πŸ“„ paper]

  • Unveiling Pitfalls: Understanding Why AI-driven Code Agents Fail at GitHub Issue Resolution [2025-03-arXiv] [πŸ“„ paper]

  • Evaluating Agent-based Program Repair at Google [2025-01-arXiv] [πŸ“„ paper]

  • SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents [2025-05-arXiv] [πŸ“„ paper] [πŸ•ΈοΈ website]

  • SWE-bench-Live: A Live Benchmark for Repository-Level Issue Resolution [2025-05-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation [2025-05-ACL] [πŸ“„ paper] [πŸ”— repo]

  • OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution [2025-05-ISSTA] [πŸ“„ paper] [πŸ”— repo]

  • SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents [2025-04-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving [2025-04-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • LibEvolutionEval: A Benchmark and Study for Version-Specific Code Generation [2025-04-NAACL] [πŸ“„ paper] [πŸ•ΈοΈ website]

  • SWEE-Bench & SWA-Bench: Automated Benchmark Generation for Repository-Level Coding Tasks [2025-03-arXiv] [πŸ“„ paper]

  • ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation [2025-03-arXiv] [πŸ“„ paper]

  • REPOST-TRAIN: Scalable Repository-Level Coding Environment Construction with Sandbox Testing [2025-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Loc-Bench: Graph-Guided LLM Agents for Code Localization [2025-03-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? [2025-02-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SolEval: Benchmarking Large Language Models for Repository-level Solidity Code Generation [2025-02-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • HumanEvo: An Evolution-aware Benchmark for More Realistic Evaluation of Repository-level Code Generation [2025-ICSE] [πŸ“„ paper] [πŸ”— repo]

  • RepoExec: On the Impacts of Contexts on Repository-Level Code Generation [2025-NAACL] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Gym: Training Software Engineering Agents and Verifiers with SWE-Gym [2024-12-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • RepoTransBench: A Real-World Benchmark for Repository-Level Code Translation [2024-12-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • Visual SWE-bench: Issue Resolving with Visual Data [2024-12-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • ExecRepoBench: Multi-level Executable Code Completion Evaluation [2024-12-arXiv] [πŸ“„ paper] [πŸ”— site]

  • REPOCOD: Can Language Models Replace Programmers? REPOCOD Says 'Not Yet' [2024-10-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation [2024-10-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-bench+: Enhanced Coding Benchmark for LLMs [2024-10-arXiv] [πŸ“„ paper]

  • SWE-bench Multimodal: Multimodal Software Engineering Benchmark [2024-10-arXiv] [πŸ“„ paper] [πŸ”— site]

  • Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [2024-10-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents [2024-06-arXiv] [πŸ“„ paper] [πŸ•ΈοΈ website]

  • CodeRAG-Bench: Can Retrieval Augment Code Generation? [2024-06-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • R2C2-Bench: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models [2024-06-arXiv] [πŸ“„ paper]

  • RepoClassBench: Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository [2024-05-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • DevEval: Evaluating Code Generation in Practical Software Projects [2024-ACL-Findings] [πŸ“„ paper] [πŸ”— repo]

  • CodAgentBench: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges [2024-ACL] [πŸ“„ paper]

  • RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems [2024-ICLR] [πŸ“„ paper] [πŸ”— repo]

  • SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [2024-ICLR] [πŸ“„ paper] [πŸ”— repo]

  • CrossCodeLongEval: Repoformer: Selective Retrieval for Repository-Level Code Completion [2024-ICML] [πŸ“„ paper] [πŸ”— repo]

  • R2E-Eval: Turning Any GitHub Repository into a Programming Agent Test Environment [2024-ICML] [πŸ“„ paper] [πŸ”— repo]

  • RepoEval: Repository-Level Code Completion Through Iterative Retrieval and Generation [2023-EMNLP] [πŸ“„ paper] [πŸ”— repo]

  • CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion [2023-NeurIPS] [πŸ“„ paper] [πŸ”— site]

  • Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation [2025-01-arXiv] [πŸ“„ paper] [πŸ”— repo]

  • SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development [2025-05-arXiv] [πŸ“„ paper] [πŸ”— repo]

Star History

Star History Chart
