SWE-AF

Autonomous software engineering fleet of AI agents for production-grade PRs on AgentField: plan, code, test, and ship.

Stars: 212

SWE-AF is an autonomous engineering team runtime built on AgentField. It spins up a full engineering team that can scope, build, adapt, and ship complex software end to end, scaling from simple goals to multi-issue programs with hundreds to thousands of agent invocations. SWE-AF offers one-call DX for quick deployment and adaptive factory control, using three nested control loops to adapt to task difficulty in real time. It features a factory architecture, continual learning, agent-scale parallelism, fleet-scale orchestration with AgentField, explicit compromise tracking, and long-run reliability.

README:

SWE-AF

Autonomous Engineering Team Runtime Built on AgentField

Pronounced: "swee-AF" (one word)

One API call → full engineering team → shipped code.

Quick Start • Why SWE-AF • In Action • Factory Control • Benchmark • Modes • API • Architecture

One API call spins up a full autonomous engineering team — product managers, architects, coders, reviewers, testers — that scopes, builds, adapts, and ships complex software end to end. SWE-AF is a first step toward autonomous software engineering factories, scaling from simple goals to hard multi-issue programs with hundreds to thousands of agent invocations.

One-Call DX

curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Refactor and harden auth + billing flows",
    "repo_url": "https://github.com/user/my-project",
    "config": {
      "runtime": "claude_code",
      "models": {
        "default": "sonnet",
        "coder": "opus",
        "qa": "opus"
      },
      "enable_learning": true
    }
  }
}
JSON

Swap models.default and any role key (coder, qa, architect, etc.) to any model your runtime supports.
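
The same request can also be assembled from Python. This is a minimal sketch using only the standard library; `build_request` is a hypothetical helper name, while the endpoint path and payload fields are copied from the curl example above. It builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # control plane address from the curl example


def build_request(goal, repo_url, config=None, base_url=BASE_URL):
    """Assemble (but do not send) the POST request for swe-planner.build."""
    payload = {"input": {"goal": goal, "repo_url": repo_url}}
    if config is not None:
        payload["input"]["config"] = config
    return urllib.request.Request(
        url=f"{base_url}/api/v1/execute/async/swe-planner.build",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    goal="Refactor and harden auth + billing flows",
    repo_url="https://github.com/user/my-project",
    config={
        "runtime": "claude_code",
        "models": {"default": "sonnet", "coder": "opus", "qa": "opus"},
        "enable_learning": True,
    },
)
# To submit: urllib.request.urlopen(request) against a running control plane.
```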

Operating Modes

SWE-AF works in two modes: point it at a single repository, or orchestrate coordinated changes across multiple repos in one build.

Single-Repository Mode

The default. Pass repo_url (remote) or repo_path (local) and SWE-AF handles everything:

curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "goal": "Add JWT auth",
      "repo_url": "https://github.com/user/my-project"
    }
  }'

Multi-Repository Mode

When your work spans multiple codebases — a primary app plus shared libraries, monorepo sub-projects, or dependent microservices — pass config.repos as an array with roles:

curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "goal": "Add JWT auth across API and shared-lib",
      "config": {
        "repos": [
          {
            "repo_url": "https://github.com/org/main-app",
            "role": "primary"
          },
          {
            "repo_url": "https://github.com/org/shared-lib",
            "role": "dependency"
          }
        ],
        "runtime": "claude_code",
        "models": {
          "default": "sonnet"
        }
      }
    }
  }'

Roles:

primary — The main application. Changes here drive the build; failures block progress.
dependency — Libraries or services modified to support the primary repo. Failures are captured but don't block.
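
A client-side sanity check for the `repos` array might look like the sketch below. `validate_repos` is a hypothetical helper, not part of SWE-AF, and the rule that a build has exactly one `primary` repo is an assumption inferred from the role descriptions above.

```python
ALLOWED_ROLES = {"primary", "dependency"}  # the two roles documented above


def validate_repos(repos):
    """Return a list of problems found in a config.repos array (empty if none)."""
    problems = []
    for index, repo in enumerate(repos):
        if repo.get("role") not in ALLOWED_ROLES:
            problems.append(f"repos[{index}]: unknown role {repo.get('role')!r}")
    # Assumption: a build is driven by exactly one primary repo.
    primaries = sum(1 for repo in repos if repo.get("role") == "primary")
    if primaries != 1:
        problems.append(f"expected exactly one primary repo, found {primaries}")
    return problems


repos = [
    {"repo_url": "https://github.com/org/main-app", "role": "primary"},
    {"repo_url": "https://github.com/org/shared-lib", "role": "dependency"},
]
print(validate_repos(repos))  # []
```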

Use cases:

Primary app + shared SDK or utilities library
Monorepo sub-projects that live in separate repos
Feature spanning multiple microservices (e.g., API + worker queue)

Autonomous Build Spotlight

Rust-based Python compiler benchmark (built autonomously):

Metric	CPython (subprocess)	RustPython (SWE-AF)	Improvement
Steady-state execution	Baseline (~19ms)	Optimized in-process runtime	88.3x-602.3x faster
Geometric mean	1.0x baseline	253.8x	253.8x
Peak throughput	~52 ops/s	31,807 ops/s	~612x

Measurement methodology

Throughput comparison measures different execution models: CPython subprocess spawn (~19ms per call → ~52 ops/s) vs RustPython pre-warmed interpreter pool (in-process). This is the real-world tradeoff the system was built to optimize — replacing repeated subprocess invocations with a persistent pool for short-snippet execution.
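
The throughput figures can be checked with a few lines of arithmetic. The inputs are the rounded values quoted above, so the results are approximate.

```python
spawn_seconds = 0.019            # ~19 ms per CPython subprocess call
cpython_ops = 1 / spawn_seconds  # calls per second when every call spawns a process
pool_ops = 31_807                # peak throughput reported for the pre-warmed pool

print(f"CPython subprocess: ~{cpython_ops:.1f} ops/s")     # ~52.6 ops/s
print(f"Peak ratio vs ~52 ops/s: ~{pool_ops / 52:.0f}x")   # ~612x
```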

The artifact trail includes 175 tracked autonomous agents across planning, coding, review, merge, and verification.

Details: examples/llm-rust-python-compiler-sonnet/README.md

Why SWE-AF

Most agent frameworks wrap a single coder loop. SWE-AF instead runs a coordinated engineering factory that adapts in real time.

Hardness-aware execution — easy issues pass through quickly, while hard issues trigger deeper adaptation and DAG-level replanning instead of blind retries.
Factory architecture — not a single-agent wrapper. Planning, execution, and governance agents run as a coordinated control stack.
Multi-model, multi-provider — assign different models per role (coder: opus, qa: haiku). Works with Claude, OpenRouter, OpenAI, and Google.
Continual learning — with enable_learning=true, conventions and failure patterns discovered early are injected into downstream issues.
Agent-scale parallelism — dependency-level scheduling + isolated git worktrees allow large fan-out without branch collisions.
Fleet-scale orchestration — many SWE-AF nodes can run continuously in parallel via AgentField, driving thousands of agent invocations across concurrent builds.
Explicit compromise tracking — when scope is relaxed, debt is typed, severity-rated, and propagated.
Long-run reliability — checkpointed execution supports resume_build after crashes or interruptions.
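
To illustrate the dependency-level scheduling mentioned in the list above, the sketch below groups an issue DAG into levels whose members could run in parallel. It is a generic layering routine under assumed inputs, not SWE-AF's scheduler; the issue names are made up.

```python
def dependency_levels(deps):
    """Group issues into levels; each issue depends only on issues in earlier levels.

    `deps` maps each issue name to the set of issue names it depends on.
    """
    remaining = {issue: set(needs) for issue, needs in deps.items()}
    done = set()
    levels = []
    while remaining:
        # An issue is ready once all of its dependencies have completed.
        ready = sorted(i for i, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency detected")
        levels.append(ready)
        done.update(ready)
        for issue in ready:
            del remaining[issue]
    return levels


issues = {
    "auth-models": set(),
    "billing-models": set(),
    "auth-api": {"auth-models"},
    "billing-api": {"billing-models", "auth-api"},
}
print(dependency_levels(issues))
# [['auth-models', 'billing-models'], ['auth-api'], ['billing-api']]
```

Issues within one level have no dependencies on each other, which is what makes it safe to give each its own worktree and run them concurrently.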

In Action

PR #179: Go SDK DID/VC Registration — built entirely by SWE-AF (Claude runtime with haiku-class models). One API call, zero human code.

Metric	Value
Issues completed	10/10
Tests passing	217
Acceptance criteria	34/34
Agent invocations	79
Model	`claude-haiku-4-5`
Total cost	$19.23

Cost breakdown by agent role

Role	Cost	%
Coder	$5.88	30.6%
Code Reviewer	$3.48	18.1%
QA	$1.78	9.2%
GitHub PR	$1.66	8.6%
Integration Tester	$1.59	8.3%
Merger	$1.22	6.3%
Workspace Ops	$1.77	9.2%
Planning (PM + Arch + TL + Sprint)	$0.79	4.1%
Verifier + Finalize	$0.34	1.8%
Synthesizer	$0.05	0.2%

79 invocations, 2,070 conversation turns. Planning agents scope and decompose; coders work in parallel isolated worktrees; reviewers and QA validate each issue; merger integrates branches; verifier checks acceptance criteria against the PRD.

Claude & open-source models supported: Run builds with either runtime and tune models per role in one flat config map.

runtime: "claude_code" maps to the Claude backend.
runtime: "open_code" maps to the OpenCode backend (OpenRouter/OpenAI/Google/Anthropic model IDs).

Adaptive Factory Control

SWE-AF uses three nested control loops to adapt to task difficulty in real time:

Loop	Scope	Trigger	Action
Inner loop	Single issue	QA/review fails	Coder retries with feedback
Middle loop	Single issue	Inner loop exhausted	`run_issue_advisor` retries with a new approach, splits work, or accepts with debt
Outer loop	Remaining DAG	Escalated failures	`run_replanner` restructures remaining issues and dependencies

This is the core factory-control behavior: control agents supervise worker agents and continuously reshape the plan as reality changes.
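
The inner and middle loops can be sketched as nested retry budgets. This is an illustrative model only: `run_issue`, `attempt`, and `advise` are hypothetical stand-ins for the coder/QA pass and `run_issue_advisor`, and the way the two budgets interact is an assumption rather than SWE-AF's actual logic.

```python
def run_issue(attempt, advise, max_coding_iterations=5, max_advisor_invocations=2):
    """Drive one issue through the inner and middle loops.

    attempt(approach, feedback) -> (passed, feedback) stands in for coder + QA/review.
    advise(feedback) -> ("retry", new_approach) or ("accept_with_debt", note).
    Returns a (status, detail) tuple.
    """
    approach, feedback = "initial", None
    for advisor_round in range(max_advisor_invocations + 1):
        # Inner loop: the coder retries with feedback until the budget runs out.
        for _ in range(max_coding_iterations):
            passed, feedback = attempt(approach, feedback)
            if passed:
                return ("done", approach)
        if advisor_round == max_advisor_invocations:
            break  # advisor budget spent
        # Middle loop: the advisor picks a new approach or accepts with debt.
        action, detail = advise(feedback)
        if action == "accept_with_debt":
            return ("accepted_with_debt", detail)
        approach, feedback = detail, None
    # Outer-loop trigger: a caller would hand this failure to the replanner.
    return ("escalated", feedback)


calls = []


def attempt(approach, feedback):
    calls.append(approach)
    return (approach == "split-handlers", "tests failed")


def advise(feedback):
    return ("retry", "split-handlers")


result = run_issue(attempt, advise, max_coding_iterations=2)
print(result)  # ('done', 'split-handlers')
```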

Quick Start

Deploy with Railway (fastest)

One click deploys SWE-AF + AgentField control plane + PostgreSQL. Set two environment variables in Railway:

CLAUDE_CODE_OAUTH_TOKEN — run claude setup-token in Claude Code CLI (uses Pro/Max subscription credits)
GH_TOKEN — GitHub personal access token with repo scope for draft PR creation

Once deployed, trigger a build:

curl -X POST https://<control-plane>.up.railway.app/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -H "X-API-Key: this-is-a-secret" \
  -d '{"input": {"goal": "Add JWT auth", "repo_url": "https://github.com/user/my-repo"}}'

1. Requirements (local)

Python 3.12+
AgentField control plane (af)
AI provider API key (Anthropic, OpenRouter, OpenAI, or Google)

2. Install

python3.12 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"

3. Run

af                 # starts AgentField control plane on :8080
python -m swe_af   # registers node id "swe-planner"

4. Trigger a build

# Default (uses Claude)
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Add JWT auth to all API endpoints",
    "repo_url": "https://github.com/user/my-project"
  }
}
JSON

# With open-source runtime + flat role map
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Add JWT auth",
    "repo_url": "https://github.com/user/my-project",
    "config": {
      "runtime": "open_code",
      "models": {
        "default": "openrouter/minimax/minimax-m2.5"
      }
    }
  }
}
JSON

# Local workspace mode (repo_path) + targeted role override
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Refactor and harden auth + billing flows",
    "repo_path": "/path/to/repo",
    "config": {
      "runtime": "claude_code",
      "models": {
        "default": "sonnet",
        "coder": "opus",
        "qa": "opus"
      },
      "enable_learning": true
    }
  }
}
JSON

For OpenRouter with open_code, use model IDs in openrouter/<provider>/<model> format (for example openrouter/minimax/minimax-m2.5).
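
A small check for that ID format, as a sketch; `parse_openrouter_model` is a hypothetical helper and not part of SWE-AF.

```python
def parse_openrouter_model(model_id):
    """Split 'openrouter/<provider>/<model>' into (provider, model)."""
    prefix, _, rest = model_id.partition("/")
    provider, _, model = rest.partition("/")
    if prefix != "openrouter" or not provider or not model:
        raise ValueError(f"expected openrouter/<provider>/<model>, got {model_id!r}")
    return provider, model


print(parse_openrouter_model("openrouter/minimax/minimax-m2.5"))
# ('minimax', 'minimax-m2.5')
```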

What Happens In One Build

Architecture is generated and reviewed before coding starts
Issues are dependency-sorted and run in parallel across isolated worktrees
Each issue gets dedicated coder, tester, and reviewer passes
Failed issues trigger advisor-driven adaptation (split, re-scope, or escalate)
Escalations trigger replanning of the remaining DAG
End result is merged, integration-tested, and verified against acceptance criteria

Typical runs spin up 400-500+ agent instances across planning, execution, QA, and verification. For larger DAGs and repeated adaptation/replanning cycles, SWE-AF can scale into the high hundreds to thousands of agent invocations in a single build.

Benchmark

95/100 with haiku and MiniMax: SWE-AF scored 95/100 with both Claude haiku-class routing (~$20) and MiniMax M2.5 via the open_code runtime (~$6), outperforming Claude Code sonnet (73), Codex o3 (62), and Claude Code haiku (59) on the same prompt.

Dimension	SWE-AF (haiku)	SWE-AF (MiniMax)	CC Sonnet	Codex (o3)	CC Haiku
Functional (30)	30	30	30	30	30
Structure (20)	20	20	10	10	10
Hygiene (20)	20	20	16	10	7
Git (15)	15	15	2	2	2
Quality (15)	10	10	15	10	10
Total	95	95	73	62	59
Cost	~$20	~$6	not reported	not reported	not reported
Time	~30-40 min	43 min	not reported	not reported	not reported

Full benchmark details and reproduction

The same prompt was tested across multiple agents. SWE-AF with the Claude runtime (haiku-class model mapping) used 400+ agent instances; SWE-AF with MiniMax M2.5 via the open_code runtime reached the same 95/100 score at roughly 70% lower cost (~$6 vs. ~$20).

Prompt used for all agents:

Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work.

Scoring framework

Dimension	Points	What it measures
Functional	30	CLI behavior and passing tests
Structure	20	Modular source layout and test organization
Hygiene	20	`.gitignore`, clean status, no junk artifacts
Git	15	Commit discipline and message quality
Quality	15	Error handling, package metadata, README quality
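
The totals in the results table follow directly from this framework. The snippet below recomputes them from the per-dimension scores reported above; the SWE-AF (MiniMax) column matches SWE-AF (haiku) on every dimension and is omitted.

```python
MAX_POINTS = {"Functional": 30, "Structure": 20, "Hygiene": 20, "Git": 15, "Quality": 15}
DIMENSIONS = list(MAX_POINTS)

# Per-dimension scores copied from the results table, in DIMENSIONS order.
SCORES = {
    "SWE-AF (haiku)": [30, 20, 20, 15, 10],
    "CC Sonnet": [30, 10, 16, 2, 15],
    "Codex (o3)": [30, 10, 10, 2, 10],
    "CC Haiku": [30, 10, 7, 2, 10],
}


def total(points):
    """Sum the dimension scores after checking each against its maximum."""
    for dimension, value in zip(DIMENSIONS, points):
        if not 0 <= value <= MAX_POINTS[dimension]:
            raise ValueError(f"{dimension} score {value} is out of range")
    return sum(points)


for agent, points in SCORES.items():
    print(f"{agent}: {total(points)}/{sum(MAX_POINTS.values())}")
# SWE-AF (haiku): 95/100
# CC Sonnet: 73/100
# Codex (o3): 62/100
# CC Haiku: 59/100
```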

Reproduction

# SWE-AF (Claude runtime, haiku-class mapping) - $20, 30-40 min
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work.",
    "repo_path": "/tmp/swe-af-output",
    "config": {
      "runtime": "claude_code",
      "models": {
        "default": "haiku"
      }
    }
  }
}
JSON

# SWE-AF (MiniMax M2.5 via open_code runtime with OpenRouter) - $6, 43 min
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work.",
    "repo_path": "/workspaces/todo-app-benchmark",
    "config": {
      "runtime": "open_code",
      "models": {
        "default": "openrouter/minimax/minimax-m2.5"
      }
    }
  }
}
JSON

# Claude Code (haiku)
claude -p "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --model haiku --dangerously-skip-permissions

# Claude Code (sonnet)
claude -p "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --model sonnet --dangerously-skip-permissions

# Codex (gpt-5.3-codex)
codex exec "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --full-auto

MiniMax M2.5 Measured Metrics (Feb 2026):

99.22% code coverage (only agent with measured coverage)
4 custom error types (TodoError, ValidationError, NotFoundError, StorageError)
999 LOC, 4 modules, 74 tests, 9 commits

Production Quality Analysis: Objective comparison of measurable metrics across all agents.

Benchmark assets, logs, evaluator, and generated projects live in examples/agent-comparison/.

Docker

cp .env.example .env
# Add your API key: ANTHROPIC_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY
# Optionally add GH_TOKEN for draft PR workflow

docker compose up -d

Submit a build:

# Default (Claude)
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Add JWT auth",
    "repo_url": "https://github.com/user/my-repo"
  }
}
JSON

# With open-source runtime (set OPENROUTER_API_KEY in .env)
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Add JWT auth",
    "repo_url": "https://github.com/user/my-repo",
    "config": {
      "runtime": "open_code",
      "models": {
        "default": "openrouter/minimax/minimax-m2.5"
      }
    }
  }
}
JSON

# Local workspace mode (repo_path)
curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "goal": "Add JWT auth",
    "repo_path": "/workspaces/my-repo"
  }
}
JSON

Scale workers:

docker compose up --scale swe-agent=3 -d

Use a control plane running on the host instead of the Docker control-plane service:

docker compose -f docker-compose.local.yml up -d

GitHub Repo Workflow (Clone -> Build -> Draft PR)

Pass repo_url instead of repo_path to have SWE-AF clone the repository and open a draft PR after execution.

curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "input": {
    "repo_url": "https://github.com/user/my-project",
    "goal": "Add comprehensive test coverage",
    "config": {
      "runtime": "claude_code",
      "models": {
        "default": "sonnet",
        "coder": "opus",
        "qa": "opus"
      }
    }
  }
}
JSON

Requirements:

GH_TOKEN in .env with repo scope
Repo access for that token

API Reference

Agent endpoints

Core async endpoints (each returns an execution_id immediately):

# Full build: plan -> execute -> verify
POST /api/v1/execute/async/swe-planner.build

# Plan only
POST /api/v1/execute/async/swe-planner.plan

# Execute a prebuilt plan
POST /api/v1/execute/async/swe-planner.execute

# Resume after interruption
POST /api/v1/execute/async/swe-planner.resume_build

Monitoring:

curl http://localhost:8080/api/v1/executions/<execution_id>
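
Because builds run asynchronously, a client will usually poll this endpoint. The sketch below is a hedged example: the shape of the response is not documented here, so the caller supplies an `is_finished` predicate instead of the code assuming any field names. `fetch_execution` and `wait_for` are hypothetical helper names.

```python
import json
import time
import urllib.request


def fetch_execution(execution_id, base_url="http://localhost:8080"):
    """GET the execution record from the monitoring endpoint shown above."""
    url = f"{base_url}/api/v1/executions/{execution_id}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for(execution_id, is_finished, fetch=fetch_execution,
             interval=10.0, max_polls=360):
    """Poll until is_finished(record) is true, then return that record."""
    for _ in range(max_polls):
        record = fetch(execution_id)
        if is_finished(record):
            return record
        time.sleep(interval)
    raise TimeoutError(f"execution {execution_id} did not finish in {max_polls} polls")
```

A caller would pass a predicate that matches whatever the control plane actually returns, for example `lambda record: record.get("status") in {"succeeded", "failed"}`, where the `status` key and its values are assumptions to be checked against a real response.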

Every specialist is also callable directly:

POST /api/v1/execute/async/swe-planner.<agent>

Agent execution flow

Agent	In -> Out
`run_product_manager`	goal -> PRD
`run_architect`	PRD -> architecture
`run_tech_lead`	architecture -> review
`run_sprint_planner`	architecture -> issue DAG
`run_issue_writer`	issue spec -> detailed issue
`run_coder`	issue + worktree -> code + tests + commit
`run_qa`	worktree -> test results
`run_code_reviewer`	worktree -> quality/security review
`run_qa_synthesizer`	QA + review -> FIX / APPROVE / BLOCK
`run_issue_advisor`	failure context -> adapt / split / accept / escalate
`run_replanner`	build state + failures -> restructured plan
`run_merger`	branches -> merged output
`run_integration_tester`	merged repo -> integration results
`run_verifier`	repo + PRD -> acceptance pass/fail
`generate_fix_issues`	failed criteria -> targeted fix issues
`run_github_pr`	branch -> push + draft PR

Configuration

Pass config to build or execute. Full schema: swe_af/execution/schemas.py

Key	Default	Description
`runtime`	`"claude_code"`	Model runtime: `"claude_code"` or `"open_code"`
`models`	`null`	Flat role-model map (`default` + role keys below)
`max_coding_iterations`	`5`	Inner-loop retry budget
`max_advisor_invocations`	`2`	Middle-loop advisor budget
`max_replans`	`2`	Build-level replanning budget
`enable_issue_advisor`	`true`	Enable issue adaptation
`enable_replanning`	`true`	Enable global replanning
`enable_learning`	`false`	Enable cross-issue shared memory (continual learning)
`agent_timeout_seconds`	`2700`	Per-agent timeout
`agent_max_turns`	`150`	Tool-use turn budget

Model Role Keys

models supports:

default
pm, architect, tech_lead, sprint_planner
coder, qa, code_reviewer, qa_synthesizer
replan, retry_advisor, issue_writer, issue_advisor
verifier, git, merger, integration_tester

Resolution order

runtime defaults < models.default < models.<role>
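
Read left to right, later entries win. A minimal sketch of that precedence, where `resolve_model` is a hypothetical helper and `runtime_default` stands in for whatever the chosen runtime would pick on its own:

```python
def resolve_model(role, models=None, runtime_default=None):
    """Apply the precedence: runtime default < models.default < models.<role>."""
    models = models or {}
    return models.get(role) or models.get("default") or runtime_default


models = {"default": "sonnet", "coder": "opus"}
print(resolve_model("coder", models))  # opus
print(resolve_model("qa", models))     # sonnet
```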

Config examples

Minimal:

{
  "runtime": "claude_code"
}

Fully customized:

{
  "runtime": "open_code",
  "models": {
    "default": "openrouter/minimax/minimax-m2.5",
    "pm": "openrouter/qwen/qwen-2.5-72b-instruct",
    "architect": "openrouter/qwen/qwen-2.5-72b-instruct",
    "coder": "openrouter/deepseek/deepseek-chat",
    "qa": "openrouter/deepseek/deepseek-chat",
    "verifier": "openrouter/qwen/qwen-2.5-72b-instruct"
  },
  "max_coding_iterations": 6,
  "enable_learning": true
}

Artifacts

.artifacts/
├── plan/           # PRD, architecture, issue specs
├── execution/      # checkpoints, per-issue logs, agent outputs
└── verification/   # acceptance criteria results

Development

make test
make check
make clean
make clean-examples

Security and Community

Contribution guide: docs/CONTRIBUTING.md
Code of conduct: CODE_OF_CONDUCT.md
Security policy: SECURITY.md
Changelog: CHANGELOG.md
License: Apache-2.0

SWE-AF is built on AgentField as a first step from single-agent harnesses to autonomous software engineering factories.
