
# Awesome-Repo-Level-Code-Generation

Must-read papers on Repository-level Code Generation & Issue Resolution
A curated list of awesome repository-level code generation research papers and resources. If you want to contribute to this list (please do), feel free to send me a pull request. If you have any further questions, feel free to contact Yuling Shi or Xiaodong Gu (SJTU).
## Contents

- Repo-Level Issue Resolution
- Repo-Level Code Completion
- Repo-Level Code Translation
- Repo-Level Unit Test Generation
- Repo-Level Code QA
- Repo-Level Issue Task Synthesis
- Datasets and Benchmarks

## Repo-Level Issue Resolution

- SWE-Exp: Experience-Driven Software Issue Resolution [2025-07-arXiv] [paper] [repo]
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution [2025-07-arXiv] [paper] [repo]
- SWE-Effi: Re-Evaluating Software AI Agent System Effectiveness Under Resource Constraints [2025-09-arXiv] [paper]
- Diffusion is a code repair operator and generator [2025-08-arXiv] [paper]
- The SWE-Bench Illusion: When State-of-the-Art LLMs Remember Instead of Reason [2025-06-arXiv] [paper]
- Agent-RLVR: Training Software Engineering Agents via Guidance and Environment Rewards [2025-06-arXiv] [paper]
- EXPEREPAIR: Dual-Memory Enhanced LLM-based Repository-Level Program Repair [2025-06-arXiv] [paper]
- Coding Agents with Multimodal Browsing are Generalist Problem Solvers [2025-06-arXiv] [paper] [repo]
- CoRet: Improved Retriever for Code Editing [2025-05-arXiv] [paper]
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents [2025-05-arXiv] [paper] [repo]
- SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development [2025-05-arXiv] [paper] [repo]
- Putting It All into Context: Simplifying Agents with LCLMs [2025-05-arXiv] [paper]
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning [2025-05-arXiv] [blog] [repo]
- AEGIS: An Agent-based Framework for General Bug Reproduction from Issue Descriptions [2025-FSE] [paper]
- Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [2025-03-arXiv] [paper] [repo]
- Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs [2025-03-arXiv] [paper]
- CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching [2025-03-arXiv] [paper]
- SEAlign: Alignment Training for Software Engineering Agent [2025-03-arXiv] [paper]
- DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [2025-03-arXiv] [paper] [repo]
- LocAgent: Graph-Guided LLM Agents for Code Localization [2025-03-arXiv] [paper] [repo]
- SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning [2025-02-arXiv] [paper]
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution [2025-02-arXiv] [paper] [repo]
- SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution [2025-01-arXiv] [paper] [repo]
- CodeMonkeys: Scaling Test-Time Compute for Software Engineering [2025-01-arXiv] [paper] [repo]
- Training Software Engineering Agents and Verifiers with SWE-Gym [2024-12-arXiv] [paper] [repo]
- CODEV: Issue Resolving with Visual Data [2024-12-arXiv] [paper] [repo]
- LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [2024-11-arXiv] [paper]
- Globant Code Fixer Agent Whitepaper [2024-11] [paper]
- MarsCode Agent: AI-native Automated Bug Fixing [2024-11-arXiv] [paper]
- Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [2024-11-arXiv] [paper] [repo]
- SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement [2024-10-arXiv] [paper] [repo]
- AutoCodeRover: Autonomous Program Improvement [2024-09-ISSTA] [paper] [repo]
- SpecRover: Code Intent Extraction via LLMs [2024-08-arXiv] [paper]
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents [2024-07-arXiv] [paper] [repo]
- AGENTLESS: Demystifying LLM-based Software Engineering Agents [2024-07-arXiv] [paper] (a minimal pipeline sketch follows this list)
- RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph [2024-07-arXiv] [paper] [repo]
- CodeR: Issue Resolving with Multi-Agent and Task Graphs [2024-06-arXiv] [paper] [repo]
- Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [2024-06-arXiv] [paper]
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [2024-NeurIPS] [paper] [repo]
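
As a concrete illustration of the pipeline style several entries above take, here is a minimal sketch of the localize → repair → validate structure described by the AGENTLESS paper. This is a simplified reading of the idea, not the authors' implementation: the prompts, the `call_llm` stub, and the pytest-based validation are hypothetical placeholders.

```python
"""Minimal sketch of a pipeline-style issue resolver (localize -> repair -> validate),
in the spirit of AGENTLESS. Prompts, names, and the `call_llm` stub are hypothetical
placeholders, not the original implementation."""

import pathlib
import subprocess


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM API call; wire up your provider of choice here."""
    raise NotImplementedError("plug in an actual LLM client")


def localize(repo_root: str, issue: str, top_k: int = 3) -> list[str]:
    """Stage 1: ask the model to rank repository files likely to need edits."""
    listing = "\n".join(str(p.relative_to(repo_root))
                        for p in pathlib.Path(repo_root).rglob("*.py"))
    reply = call_llm(
        f"Issue:\n{issue}\n\nRepository files:\n{listing}\n\n"
        f"List the {top_k} files most likely to require changes, one per line."
    )
    return reply.strip().splitlines()[:top_k]


def repair(repo_root: str, issue: str, files: list[str], n_samples: int = 4) -> list[str]:
    """Stage 2: sample several candidate patches (unified diffs) for the located files."""
    context = "\n\n".join(
        f"### {f}\n{(pathlib.Path(repo_root) / f).read_text()}" for f in files
    )
    prompt = f"Issue:\n{issue}\n\n{context}\n\nReturn a unified diff that fixes the issue."
    return [call_llm(prompt) for _ in range(n_samples)]


def validate(repo_root: str, patch: str) -> bool:
    """Stage 3: apply a candidate patch and run the test suite to filter bad patches."""
    check = subprocess.run(["git", "apply", "--check", "-"],
                           input=patch, text=True, cwd=repo_root)
    if check.returncode != 0:
        return False  # patch does not even apply cleanly
    subprocess.run(["git", "apply", "-"], input=patch, text=True, cwd=repo_root)
    tests = subprocess.run(["python", "-m", "pytest", "-x", "-q"], cwd=repo_root)
    # Reverse-apply to leave the checkout clean for the next candidate.
    subprocess.run(["git", "apply", "-R", "-"], input=patch, text=True, cwd=repo_root)
    return tests.returncode == 0


def resolve(repo_root: str, issue: str) -> str | None:
    files = localize(repo_root, issue)
    for patch in repair(repo_root, issue, files):
        if validate(repo_root, patch):
            return patch
    return None
```

The design point is that each stage narrows the search space for the next: validation only ever runs on patches that already apply cleanly to files the localizer singled out.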

## Repo-Level Code Completion

- Enhancing Project-Specific Code Completion by Inferring Internal API Information [2025-07-TSE] [paper] [repo]
- CodeRAG: Supportive Code Retrieval on Bigraph for Real-World Code Generation [2025-04-arXiv] [paper]
- CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases [2025-04-NAACL] [paper]
- RTLRepoCoder: Repository-Level RTL Code Completion through the Combination of Fine-Tuning and Retrieval Augmentation [2025-04-arXiv] [paper]
- Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs [2025-04-AAAI] [paper] [repo]
- What to Retrieve for Effective Retrieval-Augmented Code Generation? An Empirical Study and Beyond [2025-03-arXiv] [paper]
- REPOFILTER: Adaptive Retrieval Context Trimming for Repository-Level Code Completion [2025-04-OpenReview] [paper]
- Improving FIM Code Completions via Context & Curriculum Based Learning [2024-12-arXiv] [paper]
- ContextModule: Improving Code Completion via Repository-level Contextual Information [2024-12-arXiv] [paper]
- A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse With Local-Aware, Global-Aware, and Third-Party-Library-Aware [2024-12-TSE] [paper]
- RepoGenReflex: Enhancing Repository-Level Code Completion with Verbal Reinforcement and Retrieval-Augmented Generation [2024-09-arXiv] [paper]
- RAMBO: Enhancing RAG-based Repository-Level Method Body Completion [2024-09-arXiv] [paper] [repo]
- RLCoder: Reinforcement Learning for Repository-Level Code Completion [2024-07-arXiv] [paper] [repo]
- STALL+: Boosting LLM-based Repository-level Code Completion with Static Analysis [2024-06-arXiv] [paper]
- GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model [2024-06-arXiv] [paper]
- Enhancing Repository-Level Code Generation with Integrated Contextual Information [2024-06-arXiv] [paper]
- R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models [2024-06-arXiv] [paper]
- Natural Language to Class-level Code Generation by Iterative Tool-augmented Reasoning over Repository [2024-05-arXiv] [paper] [repo]
- Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback [2024-03-arXiv] [paper] [repo]
- Repoformer: Selective Retrieval for Repository-Level Code Completion [2024-03-arXiv] [paper] [repo]
- RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion [2024-03-arXiv] [paper] [repo]
- RepoMinCoder: Improving Repository-Level Code Generation Based on Information Loss Screening [2024-07-Internetware] [paper]
- CodePlan: Repository-Level Coding using LLMs and Planning [2024-07-FSE] [paper] [repo]
- DraCo: Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion [2024-05-ACL] [paper] [repo]
- RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [2023-10-EMNLP] [paper] [repo] (a minimal sketch of the idea follows this list)
- Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context [2023-09-NeurIPS] [paper] [repo]
- RepoFusion: Training Code Models to Understand Your Repository [2023-06-arXiv] [paper] [repo]
- Repository-Level Prompt Generation for Large Language Models of Code [2023-06-ICML] [paper] [repo]
- Fully Autonomous Programming with Large Language Models [2023-06-GECCO] [paper] [repo]
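
Many completion papers above are variations on retrieval-augmented generation over the repository. Below is a minimal sketch of the iterative retrieve-then-generate loop named in the RepoCoder entry: retrieve repository chunks similar to the unfinished code, draft a completion, then use the draft itself as the next retrieval query. The fixed-window chunker, Jaccard retriever, and `complete` stub are simplified stand-ins, not the paper's system.

```python
"""Minimal sketch of iterative retrieval-augmented repository-level completion,
in the spirit of RepoCoder. The chunker, lexical retriever, and `complete` stub
are simplified placeholders, not the original system."""

import pathlib


def chunk_repository(repo_root: str, window: int = 20) -> list[str]:
    """Split every Python file in the repo into fixed-size line windows."""
    chunks = []
    for path in pathlib.Path(repo_root).rglob("*.py"):
        lines = path.read_text(errors="ignore").splitlines()
        for i in range(0, len(lines), window):
            chunks.append("\n".join(lines[i : i + window]))
    return chunks


def jaccard(a: str, b: str) -> float:
    """Cheap lexical similarity over token sets (stand-in for a real retriever)."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / (len(ta | tb) or 1)


def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    return sorted(chunks, key=lambda c: jaccard(query, c), reverse=True)[:k]


def complete(prompt: str) -> str:
    """Placeholder for the code LLM call; plug in your model here."""
    raise NotImplementedError


def iterative_completion(repo_root: str, unfinished: str, rounds: int = 2) -> str:
    chunks = chunk_repository(repo_root)
    query, draft = unfinished, ""
    for _ in range(rounds):
        context = "\n\n".join(retrieve(query, chunks))
        draft = complete(f"# Relevant repo context:\n{context}\n\n{unfinished}")
        # Key idea: the draft becomes the next retrieval query, surfacing
        # cross-file context that the bare prefix alone could not match.
        query = unfinished + "\n" + draft
    return draft
```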

## Repo-Level Code Translation

- A Systematic Literature Review on Neural Code Translation [2025-05-arXiv] [paper]
- EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation [2025-08-arXiv] [paper]
- Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code [2024-04-ICSE] [paper] [repo]
- C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques [2025-01-arXiv] [paper] [repo]
- Scalable, Validated Code Translation of Entire Projects using Large Language Models [2025-06-PLDI] [paper]
- Syzygy: Dual Code-Test C to (safe) Rust Translation using LLMs and Dynamic Analysis [2024-12-arXiv] [paper] [website]

## Repo-Level Unit Test Generation

- Execution-Feedback Driven Test Generation from SWE Issues [2025-08-arXiv] [paper]
- AssertFlip: Reproducing Bugs via Inversion of LLM-Generated Passing Tests [2025-07-arXiv] [paper]
- Issue2Test: Generating Reproducing Test Cases from Issue Reports [2025-03-arXiv] [paper]
- Agentic Bug Reproduction for Effective Automated Program Repair at Google [2025-02-arXiv] [paper]
- LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [2024-11-arXiv] [paper]

## Repo-Level Code QA

- SWE-QA: Can Language Models Answer Repository-level Code Questions? [2025-09-arXiv] [paper] [repo]
- Decompositional Reasoning for Graph Retrieval with Large Language Models [2025-06-arXiv] [paper]
- LongCodeBench: Evaluating Coding LLMs at 1M Context Windows [2025-05-arXiv] [paper]
- LocAgent: Graph-Guided LLM Agents for Code Localization [2025-03-arXiv] [paper] [repo]
- CoReQA: Uncovering Potentials of Language Models in Code Repository Question Answering [2025-01-arXiv] [paper]
- RepoChat Arena [2025-Blog] [repo]
- RepoChat: An LLM-Powered Chatbot for GitHub Repository Question-Answering [2025-MSR] [repo]
- CodeQueries: A Dataset of Semantic Queries over Code [2022-09-arXiv] [paper]

## Repo-Level Issue Task Synthesis

- SWE-Mirror: Scaling Issue-Resolving Datasets by Mirroring Issues Across Repositories [2025-09-arXiv] [paper]
- R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents [2025-04-arXiv] [paper] [repo]
- SWE-bench Goes Live! [2025-05-arXiv] [paper] [repo]
- Scaling Data for Software Engineering Agents [2025-04-arXiv] [paper] [repo]
- Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs [2025-04-arXiv] [paper] [repo]
- Training Software Engineering Agents and Verifiers with SWE-Gym [2024-12-arXiv] [paper] [repo]

## Datasets and Benchmarks

- SWE-QA: Can Language Models Answer Repository-level Code Questions? [2025-09-arXiv] [paper] [repo]
- SWE-bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? [2025-09] [paper] [repo]
- AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators [2025-08-arXiv] [paper] [repo]
- LiveRepoReflection: Turning the Tide: Repository-based Code Reflection [2025-07-arXiv] [paper] [repo]
- SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? [2025-07-arXiv] [paper] [repo]
- ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code [2025-06-arXiv] [paper] [repo]
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks [2025-06-arXiv] [paper] [repo]
- UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench [2025-ACL] [paper]
- SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner [2025-ICML] [paper] [repo]
- AgentIssue-Bench: Can Agents Fix Agent Issues? [2025-08-arXiv] [paper] [repo]
- OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution [2025-05-ISSTA] [paper] [repo]
- SWE-Smith: Scaling Data for Software Engineering Agents [2025-04-arXiv] [paper] [repo]
- SWE-Synth: Synthesizing Verifiable Bug-Fix Data to Enable Large Language Models in Resolving Real-World Bugs [2025-04-arXiv] [paper] [repo]
- Are "Solved Issues" in SWE-bench Really Solved Correctly? An Empirical Study [2025-03-arXiv] [paper]
- Unveiling Pitfalls: Understanding Why AI-driven Code Agents Fail at GitHub Issue Resolution [2025-03-arXiv] [paper]
- SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? [2025-02-arXiv] [paper] [repo]
- Evaluating Agent-based Program Repair at Google [2025-01-arXiv] [paper]
- SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents [2025-05-arXiv] [paper] [website]
- SWE-bench-Live: A Live Benchmark for Repository-Level Issue Resolution [2025-05-arXiv] [paper] [repo]
- FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation [2025-05-ACL] [paper] [repo]
- SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents [2025-04-arXiv] [paper] [repo]
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving [2025-04-arXiv] [paper] [repo]
- LibEvolutionEval: A Benchmark and Study for Version-Specific Code Generation [2025-04-NAACL] [paper] [website]
- SWEE-Bench & SWA-Bench: Automated Benchmark Generation for Repository-Level Coding Tasks [2025-03-arXiv] [paper]
- ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation [2025-03-arXiv] [paper]
- REPOST-TRAIN: Scalable Repository-Level Coding Environment Construction with Sandbox Testing [2025-03-arXiv] [paper] [repo]
- Loc-Bench: Graph-Guided LLM Agents for Code Localization [2025-03-arXiv] [paper] [repo]
- SolEval: Benchmarking Large Language Models for Repository-level Solidity Code Generation [2025-02-arXiv] [paper] [repo]
- HumanEvo: An Evolution-aware Benchmark for More Realistic Evaluation of Repository-level Code Generation [2025-ICSE] [paper] [repo]
- RepoExec: On the Impacts of Contexts on Repository-Level Code Generation [2025-NAACL] [paper] [repo]
- SWE-Gym: Training Software Engineering Agents and Verifiers with SWE-Gym [2024-12-arXiv] [paper] [repo]
- RepoTransBench: A Real-World Benchmark for Repository-Level Code Translation [2024-12-arXiv] [paper] [repo]
- Visual SWE-bench: Issue Resolving with Visual Data [2024-12-arXiv] [paper] [repo]
- ExecRepoBench: Multi-level Executable Code Completion Evaluation [2024-12-arXiv] [paper] [site]
- REPOCOD: Can Language Models Replace Programmers? REPOCOD Says 'Not Yet' [2024-10-arXiv] [paper] [repo]
- M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation [2024-10-arXiv] [paper] [repo]
- SWE-bench+: Enhanced Coding Benchmark for LLMs [2024-10-arXiv] [paper]
- SWE-bench Multimodal: Multimodal Software Engineering Benchmark [2024-10-arXiv] [paper] [site]
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [2024-10-arXiv] [paper] [repo]
- SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents [2024-06-arXiv] [paper] [website]
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [2024-06-arXiv] [paper] [repo]
- R2C2-Bench: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models [2024-06-arXiv] [paper]
- RepoClassBench: Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository [2024-05-arXiv] [paper] [repo]
- DevEval: Evaluating Code Generation in Practical Software Projects [2024-ACL-Findings] [paper] [repo]
- CodAgentBench: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges [2024-ACL] [paper]
- RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems [2024-ICLR] [paper] [repo]
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [2024-ICLR] [paper] [repo] (a loading sketch follows this list)
- CrossCodeLongEval: Repoformer: Selective Retrieval for Repository-Level Code Completion [2024-ICML] [paper] [repo]
- R2E-Eval: Turning Any GitHub Repository into a Programming Agent Test Environment [2024-ICML] [paper] [repo]
- RepoEval: Repository-Level Code Completion Through Iterative Retrieval and Generation [2023-EMNLP] [paper] [repo]
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion [2023-NeurIPS] [paper] [site]
- Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation [2025-01-arXiv] [paper] [repo]
- SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development [2025-05-arXiv] [paper] [repo]
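
Many of the issue-resolution benchmarks above follow the instance format introduced by SWE-bench. As a usage sketch, the snippet below inspects SWE-bench instances via the `datasets` library, assuming the publicly hosted `princeton-nlp/SWE-bench` dataset on Hugging Face; the field names follow the public dataset card and may differ across benchmark variants.

```python
"""Sketch: inspect SWE-bench issue-resolution instances. Assumes the Hugging Face
`datasets` library and the public princeton-nlp/SWE-bench dataset; field names
may drift between versions and variants."""

from datasets import load_dataset

# Each instance pairs a real GitHub issue with the gold patch that resolved it.
swe_bench = load_dataset("princeton-nlp/SWE-bench", split="test")

example = swe_bench[0]
print(example["instance_id"])              # unique identifier for the task instance
print(example["repo"])                     # source repository, e.g. "owner/name"
print(example["base_commit"])              # commit to check out before attempting a fix
print(example["problem_statement"][:500])  # the issue text shown to the system

# A candidate patch is judged by applying it at base_commit and running the
# benchmark's fail-to-pass tests; the gold reference patch is available here.
print(example["patch"][:500])
```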