AwesomeLLM4APR

A Systematic Literature Review on Large Language Models for Automated Program Repair

Awesome LLM for APR is a repository dedicated to exploring the capabilities of Large Language Models (LLMs) in Automated Program Repair (APR). It collects research papers, tools, and resources on using LLMs across repair scenarios: semantic bugs, security vulnerabilities, syntax errors, programming problems, static warnings, self-debugging, type errors, web UI tests, smart contracts, hardware bugs, performance bugs, API misuses, crash bugs, test case repair, formal proofs, GitHub issues, code reviews, and motion planners. It also covers human studies and patch correctness assessment. The repository serves as a reference for researchers and practitioners interested in applying LLMs to automated program repair.

README:

🤖 Awesome LLM for APR

📖 Contents

👏 Citation

@article{zhang2024survey,
  title={A Systematic Literature Review on Large Language Models for Automated Program Repair},
  author={Zhang, Quanjun and Fang, Chunrong and Xie, Yang and Ma, Yuxiang and Sun, Weisong and Yang, Yun and Chen, Zhenyu},
  journal={arXiv preprint arXiv:2405.01466},
  year={2024}
}

🚗 Todo List

  • [ ] add SE agent-based studies for GitHub Issues
  • [ ] add ISSTA 2024 Papers

🔥🔥 New Papers

  1. 🔥Exploring and Lifting the Robustness of LLM-powered Automated Program Repair with Metamorphic Testing [2024-arXiv] [paper]
  2. Divide-and-Conquer: Automating Code Revisions via Localization-and-Revision [2024-TOSEM]
  3. From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging [2024-arXiv] [paper] [repo]
  4. Automated Program Repair for Introductory Programming Assignments [2024-TLT] [paper]
  5. Automated Repair of AI Code with Large Language Models and Formal Verification [2024-arXiv] [paper]
  6. CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair [2024-arXiv-NVIDIA] [paper]
  7. Benchmarking Automated Program Repair: An Extensive Study on Both Real-World and Artificial Bugs [2024-ISSTA] [paper]
  8. Automated program repair via conversation: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT [2024-ISSTA] [paper]
  9. Leveraging Large Language Model for Automatic Patch Correctness Assessment [2024-TSE] [paper]
  10. Automated program repair for variability bugs in software product line systems [2024-JSS] [paper]
  11. PyBugHive: A Comprehensive Database of Manually Validated, Reproducible Python Bugs [2024-IEEE Access] [paper]

💡 Repair Scenarios

Semantic Bug

  1. 🔥Automated program repair for variability bugs in software product line systems [2024-JSS] [paper]
  2. 🔥A Unified Debugging Approach via LLM-Based Multi-Agent Synergy [2024-arxiv] [paper] [repo]
  3. 🔥How Far Can We Go with Practical Function-Level Program Repair? [2024-arxiv] [paper] [repo]
  4. 🔥Automated program repair via conversation: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT [2024-ISSTA] [paper]
    Old Version: Keep the Conversation Going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT [2023-arxiv] [paper]
  5. A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [2024-arxiv] [paper] [repo]
  6. Out of Context: How important is Local Context in Neural Program Repair? [2024-ICSE] [paper] [repo]
  7. Multi-Objective Fine-Tuning for Enhanced Program Repair with LLMs [2024-arxiv] [paper]
  8. Aligning LLMs for FL-free Program Repair [2024-arxiv] [paper]
  9. ContrastRepair: Enhancing Conversation-Based Automated Program Repair via Contrastive Test Case Pairs [2024-arxiv] [paper]
  10. Exploring the Potential of Pre-Trained Language Models of Code for Automated Program Repair [2024-Electronics] [paper]
  11. CigaR: Cost-efficient Program Repair with LLMs [2024-arxiv] [paper] [repo]
  12. The Fact Selection Problem in LLM-Based Program Repair [2024-arxiv] [paper] [repo]
  13. A Novel Approach for Automated Program Repair using Round-Trip Translation with Large Language Models [2024-arxiv] [paper] [repo]
  14. RepairAgent: An Autonomous, LLM-Based Agent for Program Repair [2024-arxiv] [paper]
  15. A Deep Dive into Large Language Models for Automated Bug Localization and Repair [2024-FSE/ESEC] [paper]
  16. Automated Program Repair in the Era of Large Pre-trained Language Models [2023-ICSE] [paper] [repo]
  17. Repair Is Nearly Generation: Multilingual Program Repair with LLMs [2023-AAAI] [paper]
  18. Retrieval-based prompt selection for code-related few-shot learning [2023-ICSE] [paper] [repo]
  19. What makes good in-context demonstrations for code intelligence tasks with llms? [2023-ASE] [paper] [repo]
  20. Fully Autonomous Programming with Large Language Models [2023-GECCO] [paper] [repo]
  21. Automated Program Repair Using Generative Models for Code Infilling [2023-AIED] [paper] [repo]
  22. STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for Automatic Bug Fixing [2023-arxiv] [paper]
  23. Conversational automated program repair [2023-arxiv] [paper]
  24. Is ChatGPT the Ultimate Programming Assistant--How far is it? [2023-arxiv] [paper] [repo]
  25. Using Large Language Models for Bug Localization and Fixing [2023-iCAST] [paper]
  26. An Empirical Study on Fine-Tuning Large Language Models of Code for Automated Program Repair [2023-ASE] [paper] [repo]
  27. An Evaluation of the Effectiveness of OpenAI's ChatGPT for Automated Python Program Bug Fixing using QuixBugs [2023-iSEMANTIC] [paper]
  28. Explainable Automated Debugging via Large Language Model-driven Scientific Debugging [2023-arxiv] [paper]
  29. The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model [2023-arxiv] [paper]
  30. Impact of Code Language Models on Automated Program Repair [2023-ICSE] [paper] [repo]
  31. Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions [2023-arxiv] [paper]
  32. The Plastic Surgery Hypothesis in the Era of Large Language Models [2023-ASE] [paper] [repo]
  33. Exploring the Limits of ChatGPT in Software Security Applications [2023-arxiv] [paper]
  34. CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation [2023-arxiv] [paper] [repo]
  35. Enhancing Automated Program Repair through Fine-tuning and Prompt Engineering [2023-arxiv] [paper] [repo]
  36. Training Language Models for Programming Feedback Using Automated Repair Tools [2023-AIED] [paper] [repo]
  37. RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair [2023-arxiv] [paper] [repo]
  38. Automated Code Editing with Search-Generate-Modify [2023-arxiv] [paper] [repo]
  39. RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [2023-FSE/ESEC] [paper] [repo]
  40. Neural Program Repair with Program Dependence Analysis and Effective Filter Mechanism [2023-arxiv] [paper]
  41. Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback [2023-arxiv] [paper] [repo]
  42. A study on Prompt Design, Advantages and Limitations of ChatGPT for Deep Learning Program Repair [2023-arxiv] [paper]
  43. Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair [2023-FSE/ESEC] [paper] [repo]
  44. Gamma: Revisiting Template-Based Automated Program Repair Via Mask Prediction [2023-ASE] [paper] [repo]
  45. An Extensive Study on Model Architecture and Program Representation in the Domain of Learning-based Automated Program Repair [2023-APR] [paper] [repo]
  46. Improving Automated Program Repair with Domain Adaptation [2023-TOSEM] [paper] [repo]
  47. Enhancing Code Language Models for Program Repair by Curricular Fine-tuning Framework [2023-ICSME] [paper]
  48. The potential use of ChatGPT for debugging and bug fixing [2023-] [paper]
  49. CIRCLE: Continual Repair across Programming Languages [2022-ISSTA] [paper] [repo]
  50. Towards JavaScript program repair with Generative Pre-trained Transformer (GPT-2) [2022-APR] [paper] [repo]
  51. Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar [2022-ICLR] [paper]
  52. Patch Generation with Language Models: Feasibility and Scaling Behavior [2022-ICLR] [paper]
  53. Can OpenAI's codex fix bugs?: an evaluation on QuixBugs [2022-APR] [paper]
  54. An Analysis of the Automatic Bug Fixing Performance of ChatGPT [2022-APR] [paper] [repo]
  55. Less training, more repairing please: revisiting automated program repair via zero-shot learning [2022-FSE/ESEC] [paper] [repo]
  56. Framing Program Repair as Code Completion [2022-APR] [paper] [repo]
  57. DEAR: A Novel Deep Learning-based Approach for Automated Program Repair [2022-ICSE] [paper] [repo]
  58. Generating Bug-Fixes Using Pretrained Transformers [2021-PLDI] [paper]
  59. Applying CodeBERT for Automated Program Repair of Java Simple Bugs [2021-MSR] [paper] [repo]
  60. CURE: Code-Aware Neural Machine Translation for Automatic Program Repair [2021-ICSE] [paper] [repo]

Security Vulnerability

  1. 🔥Automated Repair of AI Code with Large Language Models and Formal Verification [2024-arXiv] [paper]

  2. 🔥NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair [2024-arxiv] [paper]

  3. Enhanced Automated Code Vulnerability Repair using Large Language Models [2024-arxiv] [paper]

  4. Out of Sight, Out of Mind: Better Automatic Vulnerability Repair by Broadening Input Ranges and Sources [2024-ICSE] [paper] [repo]

  5. A Study of Vulnerability Repair in JavaScript Programs with Large Language Models [2024-arxiv] [paper] [repo]

  6. Chain-of-Thought Prompting of Large Language Models for Discovering and Fixing Software Vulnerabilities [2024-arxiv] [paper]

  7. Pre-trained Model-based Automated Software Vulnerability Repair: How Far are We? [2023-TDSC] [paper] [repo]

  8. Examining zero-shot vulnerability repair with large language models [2023-S&P] [paper] [repo]

  9. An Empirical Study on Fine-Tuning Large Language Models of Code for Automated Program Repair [2023-ASE] [paper] [repo]

  10. A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification [2023-arxiv] [paper]

  11. Exploring the Limits of ChatGPT in Software Security Applications [2023-arxiv] [paper]

  12. ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching [2023-arxiv] [paper]

  13. How ChatGPT is Solving Vulnerability Management Problem [2023-arxiv] [paper] [repo]

  14. How Effective Are Neural Networks for Fixing Security Vulnerabilities [2023-ISSTA] [paper] [repo]

  15. Vision Transformer-Inspired Automated Vulnerability Repair [2023-TOSEM] [paper] [repo]

  16. Can large language models find and fix vulnerable software? [2023-arxiv] [paper]

  17. VulRepair: A T5-Based Automated Software Vulnerability Repair [2022-FSE/ESEC] [paper] [repo]

Syntax Error

  1. A Novel Approach for Automated Program Repair using Round-Trip Translation with Large Language Models [2024-arxiv] [paper] [repo]
  2. Repair Is Nearly Generation: Multilingual Program Repair with LLMs [2023-AAAI] [paper]
  3. Fixing Rust Compilation Errors using LLMs [2023-arxiv] [paper]
  4. An Empirical Study on Fine-Tuning Large Language Models of Code for Automated Program Repair [2023-ASE] [paper] [repo]
  5. A Chain of AI-based Solutions for Resolving FQNs and Fixing Syntax Errors in Partial Code [2023-arxiv] [paper] [repo]
  6. The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model [2023-arxiv] [paper]
  7. SYNSHINE: improved fixing of Syntax Errors [2022-TSE] [paper] [repo]

Programming Problem

  1. 🔥CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair [2024-arXiv-NVIDIA] [paper]
  2. A Unified Debugging Approach via LLM-Based Multi-Agent Synergy [2024-arXiv] [paper] [repo]
  3. PyDex: Repairing Bugs in Introductory Python Assignments using LLMs [2024-OOPSLA] [paper] [repo]
  4. DebugBench: Evaluating Debugging Capability of Large Language Models [2024-arxiv] [paper] [repo]
  5. ContrastRepair: Enhancing Conversation-Based Automated Program Repair via Contrastive Test Case Pairs [2024-arxiv] [paper]
  6. ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair [2024-arxiv] [paper] [repo]
  7. Peer-aided Repairer: Empowering Large Language Models to Repair Advanced Student Assignments [2024-arxiv] [paper]
  8. Improved Program Repair Methods using Refactoring with GPT Models [2024-SIGCSE TS] [paper] [repo]
  9. A critical review of large language model on software engineering: An example from chatgpt and automated program repair [2023-arxiv] [paper] [repo]
  10. Automated Repair of Programs from Large Language Models [2023-ICSE] [paper] [repo]
  11. FixEval: Execution-based Evaluation of Program Fixes for Programming Problems [2023-APR] [paper] [repo]
  12. Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues [2023-TOSEM] [paper] [repo]
  13. Repairing bugs in python assignments using large language models [2022-arxiv] [paper]

Static Warning

  1. Frustrated with Code Quality Issues? LLMs can Help! [2024-FSE/ESEC] [paper] [repo]
  2. SkipAnalyzer: An Embodied Agent for Code Analysis with Large Language Models [2023-arxiv] [paper] [repo]
  3. RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [2023-FSE/ESEC] [paper] [repo]
  4. InferFix: End-to-End Program Repair with LLMs over Retrieval-Augmented Prompts [2023-FSE/ESEC] [paper] [repo]
  5. Can LLMs Patch Security Issues [2023-arxiv] [paper] [repo]
  6. Improving Automated Program Repair with Domain Adaptation [2023-TOSEM] [paper] [repo]
  7. An empirical study of deep transfer learning-based program repair for Kotlin projects [2022-FSE/ESEC] [paper]
  8. TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer [2021-PMLR] [paper] [repo]

Self-Debug

  1. From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging [2024-arXiv] [paper] [repo]
  2. Teaching Large Language Models to Self-Debug [2024-ICLR] [paper]
  3. OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement [2024-arxiv] [paper] [repo]
  4. CYCLE: Learning to Self-Refine the Code Generation [2024-OOPSLA] [paper] [repo]
  5. LDB: A Large Language Model Debugger via Verifying Runtime Execution Step by Step [2024-arxiv] [paper] [repo]
  6. Leveraging Print Debugging to Improve Code Generation in Large Language Models [2024-arxiv] [paper]
  7. SelfEvolve: A Code Evolution Framework via Large Language Models [2023-arxiv] [paper]
  8. Self-Refine: Iterative Refinement with Self-Feedback [2023-NeurIPS] [paper] [repo]
  9. AgentCoder: Multi Agent-Code Generation with Iterative Testing and Optimisation [2023-arxiv] [paper]
  10. Self-Edit: Fault-Aware Code Editor for Code Generation [2023-ACL] [paper] [repo]
  11. Is Self-Repair a Silver Bullet for Code Generation? [2023-ICLR] [paper] [repo]

Type Error

  1. Domain Knowledge Matters: Improving Prompts with Fix Templates for Repairing Python Type Errors [2024-ICSE] [paper] [repo]
  2. PyTy: Repairing Static Type Errors in Python [2024-ICSE] [paper] [repo]
  3. GPT-3-Powered Type Error Debugging: Investigating the Use of Large Language Models for Code Repair [2023-SLE] [paper] [repo]

Web UI Test

  1. Guiding ChatGPT to Fix Web UI Tests via Explanation-Consistency Checking [2023-arxiv] [paper]

Smart Contract

  1. ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts [2024-arxiv] [paper]
  2. Evaluating ChatGPT for Smart Contracts Vulnerability Correction [2023-COMPSAC] [paper] [repo]

Hardware Bug

  1. On Hardware Security Bug Code Fixes By Prompting Large Language Models [2024-TIFS] [paper] [repo]
    Its pre-print: Fixing Hardware Security Bugs with Large Language Models [2022-arXiv] [paper]
  2. HDLdebugger: Streamlining HDL debugging with Large Language Models [2024-arxiv] [paper]
  3. RTLFixer: Automatically Fixing RTL Syntax Errors with Large Language Models [2023-arxiv] [paper]
  4. LLM4SecHW: Leveraging domain-specific large language model for hardware debugging [2023-AsianHOST] [paper]

Performance Bug

  1. RAPGen: An Approach for Fixing Code Inefficiencies in Zero-Shot [2023-arxiv] [paper]
  2. DeepDev-PERF: A Deep Learning-Based Approach for Improving Software Performance [2022-FSE/ESEC] [paper] [repo]

API Misuse

  1. Evaluating Pre-trained Language Models for Repairing API Misuses [2023-arxiv] [paper] [repo]

Crash Bug

  1. Resolving Crash Bugs via Large Language Models: An Empirical Study [2023-arxiv] [paper] [repo]

Test Case

  1. Automated Test Case Repair Using Language Models [2024-arxiv] [paper]
  2. Identify and Update Test Cases when Production Code Changes: A Transformer-based Approach [2023-ASE]

Formal Proof

  1. Baldur: Whole-Proof Generation and Repair with Large Language Models [2023-FSE/ESEC] [paper]

Translation Bug

  1. Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code [2024-ICSE] [paper] [repo]

GitHub Issue

  1. SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [2024-ICLR] [paper] [repo]

Code Review

  1. Exploring the Potential of ChatGPT in Automated Code Refinement: An Empirical Study [2024-ICSE] [paper] [repo]

Motion Planner

  1. DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models [2024-arxiv] [paper] [repo]

🙆 Human Study

  1. Exploring Experiences with Automated Program Repair in Practice [2024-ICSE] [paper]
  2. Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models [2024-arxiv] [paper] [repo]
  3. An Empirical Study of Adoption of ChatGPT for Bug Fixing among Professional Developers [2023-ITA] [paper]

🙅 Patch Correctness Assessment

  1. 🔥Leveraging Large Language Model for Automatic Patch Correctness Assessment [2024-TSE] [paper]
  2. APPT: Boosting Automated Patch Correctness Prediction via Pre-trained Language Model [2024-TSE] [paper] [repo]
  3. The Best of Both Worlds: Combining Learned Embeddings with Engineered Features for Accurate Prediction of Correct Patches [2023-TOSEM] [paper] [repo]
  4. Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning [2023-TSE] [paper] [repo]
  5. PatchZero: Zero-Shot Automatic Patch Correctness Assessment [2023-arxiv] [paper]
  6. Is this Change the Answer to that Problem? Correlating Descriptions of Bug and Code Changes for Evaluating Patch Correctness [2021-ASE] [paper] [repo]
  7. Evaluating representation learning of code changes for predicting patch correctness in program repair [2020-ASE] [paper] [repo]

📊 Benchmark

  1. 🔥MuBench: Benchmarking Automated Program Repair: An Extensive Study on Both Real-World and Artificial Bugs [2024-ISSTA] [paper]
  2. CodeEditorBench: Evaluating Code Editing Capability of Large Language Models [2024-arxiv] [paper] [repo]
  3. GitBug-Java: A Reproducible Benchmark of Recent Java Bugs [2024-arxiv] [paper] [repo]
  4. SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [2024-ICLR] [paper] [repo]
  5. DebugBench: Evaluating Debugging Capability of Large Language Models [2024-arxiv] [paper] [repo]
  6. ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair [2024-arxiv] [paper] [repo]
  7. A critical review of large language model on software engineering: An example from chatgpt and automated program repair [2023-arxiv] [paper] [repo]
  8. CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation [2023-arxiv] [paper] [repo]
  9. FixEval: Execution-based Evaluation of Program Fixes for Programming Problems [2023-APR] [paper] [repo]

🤔 Related APR Surveys

  1. A Survey of Learning-based Automated Program Repair [2023-TOSEM] [paper] [repo]
  2. Automatic Software Repair: A Bibliography [2018-CSUR] [paper]
  3. Automatic Software Repair: A Survey [2017-TSE] [paper]
