pyspur

A visual playground for agentic workflows: Iterate over your agents 10x faster

Stars: 4152

Visit

PySpur is a graph-based editor designed for LLM (Large Language Models) workflows. It offers modular building blocks, node-level debugging, and performance evaluation. The tool is easy to hack, supports JSON configs for workflow graphs, and is lightweight with minimal dependencies. Users can quickly set up PySpur by cloning the repository, creating a .env file, starting docker services, and accessing the portal. PySpur can also work with local models served using Ollama, with steps provided for configuration. The roadmap includes features like canvas, async/batch execution, support for Ollama, new nodes, pipeline optimization, templates, code compilation, multimodal support, and more.

README:

Iterate over your agents 10x faster. AI engineers use PySpur to iterate over AI agents visually without reinventing the wheel.

https://github.com/user-attachments/assets/54d0619f-22fd-476c-bf19-9be083d7e710

🕸️ Why PySpur?

Problem: It takes a 1,000 tiny paper cuts to make AI reliable

AI engineers today face three problems of building agents:

Prompt Hell: Hours of prompt tweaking and trial-and-error frustration.
Workflow Blindspots: Lack of visibility into step interactions causing hidden failures and confusion.
Terminal Testing Nightmare Squinting at raw outputs and manually parsing JSON.

We've been there ourselves, too. We launched a graphic design agent early 2024 and quickly reached thousands of users, yet, struggled with the lack of its reliability and existing debugging tools.

Solution: A playground for agents that saves time

✨ Core features:

👤 Human in the Loop: Persistent workflows that wait for human approval.
🔄 Loops: Iterative tool calling with memory.
📤 File Upload: Upload files or paste URLs to process documents.
📋 Structured Outputs: UI editor for JSON Schemas.
🗃️ RAG: Parse, Chunk, Embed, and Upsert Data into a Vector DB.
🖼️ Multimodal: Support for Video, Images, Audio, Texts, Code.
🧰 Tools: Slack, Firecrawl.dev, Google Sheets, GitHub, and more.
📊 Traces: Automatically capture execution traces of deployed agents.
🧪 Evals: Evaluate agents on real-world datasets.
🚀 One-Click Deploy: Publish as an API and integrate wherever you want.
🐍 Python-Based: Add new nodes by creating a single Python file.
🎛️ Any-Vendor-Support: >100 LLM providers, embedders, and vector DBs.

⚡ Quick start

This is the quickest way to get started. Python 3.11 or higher is required.

Install PySpur:
```
pip install pyspur
```
Initialize a new project:
```
pyspur init my-project
cd my-project
```
This will create a new directory with a .env file.
Start the server:
```
pyspur serve --sqlite
```
By default, this will start PySpur app at http://localhost:6080 using a sqlite database. We recommend you configure a postgres instance URL in the .env file to get a more stable experience.
[Optional] Configure Your Environment and Add API Keys:
- App UI: Navigate to API Keys tab to add provider keys (OpenAI, Anthropic, etc.)
- Manual: Edit .env file (recommended: configure postgres) and restart with pyspur serve

😎 Feature Reel

Human-in-the-loop breakpoints:

These breakpoints pause the workflow when reached and resume whenever a human approves it. They enable human oversight for workflows that require quality assurance: verify critical outputs before the workflow proceeds.

https://github.com/user-attachments/assets/98cb2b4e-207c-4d97-965b-4fee47c94ce8

Debug at Node Level:

https://github.com/user-attachments/assets/6e82ad25-2a46-4c50-b030-415ea9994690

Multimodal (Upload files or paste URLs)

PDFs, Videos, Audio, Images, ...

https://github.com/user-attachments/assets/83ed9a22-1ec1-4d86-9dd6-5d945588fd0b

Loops

RAG

Step 1) Create Document Collection (Chunking + Parsing)

https://github.com/user-attachments/assets/c77723b1-c076-4a64-a01d-6d6677e9c60e

Step 2) Create Vector Index (Embedding + Vector DB Upsert)

https://github.com/user-attachments/assets/50e5c711-dd01-4d92-bb23-181a1c5bba25

Modular Building Blocks

https://github.com/user-attachments/assets/6442f0ad-86d8-43d9-aa70-e5c01e55e876

Evaluate Final Performance

https://github.com/user-attachments/assets/4dc2abc3-c6e6-4d6d-a5c3-787d518de7ae

Coming soon: Self-improvement

https://github.com/user-attachments/assets/5bef7a16-ef9f-4650-b385-4ea70fa54c8a

🛠️ PySpur Development Setup

[ Instructions for development on Unix-like systems. Development on Windows/PC not supported ]

We recommend using Cursor/VS Code with our dev container (.devcontainer/devcontainer.json) for:

Consistent development environment with pre-configured tools and extensions
Optimized settings for Python and TypeScript development
Automatic hot-reloading and port forwarding

Option 1: Cursor/VS Code Dev Container (Recommended)

Install Cursor/VS Code and the Dev Containers extension
Clone and open the repository
Click "Reopen in Container" when prompted

Option 2: Manual Setup

Clone the repository:

git clone https://github.com/PySpur-com/pyspur.git
cd pyspur

Launch using docker-compose.dev.yml:

docker compose -f docker-compose.dev.yml up --build -d

Customize your setup: Edit .env to configure your environment (e.g., PostgreSQL settings).

Note: Manual setup requires additional configuration and may not include all dev container features.

⭐ Support us

You can support us in our work by leaving a star! Thank you!

Your feedback will be massively appreciated. Please tell us which features on that list you like to see next or request entirely new ones.

For Tasks:

Click tags to check more tools for each tasks

create workflow debug nodes evaluate performance configure local models manage llm provider keys

For Jobs:

data scientist machine learning engineer ai researcher software developer data analyst

Alternative AI tools for pyspur

Similar Open Source Tools

pyspur

github

: 4.2k

biniou

biniou is a self-hosted webui for various GenAI (generative artificial intelligence) tasks. It allows users to generate multimedia content using AI models and chatbots on their own computer, even without a dedicated GPU. The tool can work offline once deployed and required models are downloaded. It offers a wide range of features for text, image, audio, video, and 3D object generation and modification. Users can easily manage the tool through a control panel within the webui, with support for various operating systems and CUDA optimization. biniou is powered by Huggingface and Gradio, providing a cross-platform solution for AI content generation.

github

: 569

airflow-code-editor

The Airflow Code Editor Plugin is a tool designed for Apache Airflow users to edit Directed Acyclic Graphs (DAGs) directly within their browser. It offers a user-friendly file management interface for effortless editing, uploading, and downloading of files. With Git support enabled, users can store DAGs in a Git repository, explore Git history, review local modifications, and commit changes. The plugin enhances workflow efficiency by providing seamless DAG management capabilities.

github

: 416

ComfyUI-fal-API

ComfyUI-fal-API is a repository containing custom nodes for using Flux models with fal API in ComfyUI. It provides nodes for image generation, video generation, language models, and vision language models. Users can easily install and configure the repository to access various nodes for different tasks such as generating images, creating videos, processing text, and understanding images. The repository also includes troubleshooting steps and is licensed under the Apache License 2.0.

github

: 53

rtp-llm

**rtp-llm** is a Large Language Model (LLM) inference acceleration engine developed by Alibaba's Foundation Model Inference Team. It is widely used within Alibaba Group, supporting LLM service across multiple business units including Taobao, Tmall, Idlefish, Cainiao, Amap, Ele.me, AE, and Lazada. The rtp-llm project is a sub-project of the havenask.

github

: 474

shards

Shards is a high-performance, multi-platform, type-safe programming language designed for visual development. It is a dataflow visual programming language that enables building full-fledged apps and games without traditional coding. Shards features automatic type checking, optimized shard implementations for high performance, and an intuitive visual workflow for beginners. The language allows seamless round-trip engineering between code and visual models, empowering users to create multi-platform apps easily. Shards also powers an upcoming AI-powered game creation system, enabling real-time collaboration and game development in a low to no-code environment.

github

: 76

fastRAG

fastRAG is a research framework designed to build and explore efficient retrieval-augmented generative models. It incorporates state-of-the-art Large Language Models (LLMs) and Information Retrieval to empower researchers and developers with a comprehensive tool-set for advancing retrieval augmented generation. The framework is optimized for Intel hardware, customizable, and includes key features such as optimized RAG pipelines, efficient components, and RAG-efficient components like ColBERT and Fusion-in-Decoder (FiD). fastRAG supports various unique components and backends for running LLMs, making it a versatile tool for research and development in the field of retrieval-augmented generation.

github

: 1.3k

better-genshin-impact

BetterGI is a project based on computer vision technology, which aims to make Genshin Impact better. It can automatically pick up items, skip dialogues, automatically select options, automatically submit items, close pop-up pages, etc. When talking to Katherine, it can automatically receive the "Daily Commission" rewards and automatically re-dispatch. When the automatic plot function is turned on, this function will take effect, and the invitation options will be automatically selected. AI recognizes automatic casting, automatically reels in when the fish is hooked, and automatically completes the fishing progress. Help you easily complete the Seven Saint Summoning character invitation, weekly visitor challenge and other PVE content. Automatically use the "King Tree Blessing" with the `Z` key, and use the principle of refreshing wood by going online and offline to hang up a backpack full of wood. Write combat scripts to let the team fight automatically according to your strategy. Fully automatic secret realm hangs up to restore physical strength, automatically enters the secret realm to open the key, fight, walk to the ancient tree and receive rewards. Click the teleportation point on the map, or if there is a teleportation point in the list that appears after clicking, it will automatically click the teleportation point and teleport. Set a shortcut key, and long press to continuously rotate the perspective horizontally (of course you can also use it to rotate the grass god). Quickly switch between "Details" and "Enhance" pages to skip the display of holy relic enhancement results and quickly +20. You can quickly purchase items in the store in full quantity, which is suitable for quickly clearing event redemptions,塵歌壺 store redemptions, etc.

github

: 7.9k

CrewAI-Studio

CrewAI Studio is an application with a user-friendly interface for interacting with CrewAI, offering support for multiple platforms and various backend providers. It allows users to run crews in the background, export single-page apps, and use custom tools for APIs and file writing. The roadmap includes features like better import/export, human input, chat functionality, automatic crew creation, and multiuser environment support.

github

: 682

promptfoo

Promptfoo is a tool for testing and evaluating LLM output quality. With promptfoo, you can build reliable prompts, models, and RAGs with benchmarks specific to your use-case, speed up evaluations with caching, concurrency, and live reloading, score outputs automatically by defining metrics, use as a CLI, library, or in CI/CD, and use OpenAI, Anthropic, Azure, Google, HuggingFace, open-source models like Llama, or integrate custom API providers for any LLM API.

github

: 6.1k

duolingo-clone

Lingo is an interactive platform for language learning that provides a modern UI/UX experience. It offers features like courses, quests, and a shop for users to engage with. The tech stack includes React JS, Next JS, Typescript, Tailwind CSS, Vercel, and Postgresql. Users can contribute to the project by submitting changes via pull requests. The platform utilizes resources from CodeWithAntonio, Kenney Assets, Freesound, Elevenlabs AI, and Flagpack. Key dependencies include @clerk/nextjs, @neondatabase/serverless, @radix-ui/react-avatar, and more. Users can follow the project creator on GitHub and Twitter, as well as subscribe to their YouTube channel for updates. To learn more about Next.js, users can refer to the Next.js documentation and interactive tutorial.

github

: 104

ComfyUI-Copilot

ComfyUI-Copilot is an intelligent assistant built on the Comfy-UI framework that simplifies and enhances the AI algorithm debugging and deployment process through natural language interactions. It offers intuitive node recommendations, workflow building aids, and model querying services to streamline development processes. With features like interactive Q&A bot, natural language node suggestions, smart workflow assistance, and model querying, ComfyUI-Copilot aims to lower the barriers to entry for beginners, boost development efficiency with AI-driven suggestions, and provide real-time assistance for developers.

github

: 949

chatbox

Chatbox is a desktop client for ChatGPT, Claude, and other LLMs, providing a user-friendly interface for AI copilot assistance on Windows, Mac, and Linux. It offers features like local data storage, multiple LLM provider support, image generation with Dall-E-3, enhanced prompting, keyboard shortcuts, and more. Users can collaborate, access the tool on various platforms, and enjoy multilingual support. Chatbox is constantly evolving with new features to enhance the user experience.

github

: 24.8k

llama-assistant

Llama Assistant is a local AI assistant that respects your privacy. It is an AI-powered assistant that can recognize your voice, process natural language, and perform various actions based on your commands. It can help with tasks like summarizing text, rephrasing sentences, answering questions, writing emails, and more. The assistant runs offline on your local machine, ensuring privacy by not sending data to external servers. It supports voice recognition, natural language processing, and customizable UI with adjustable transparency. The project is a work in progress with new features being added regularly.

github

: 485

chatbox

Chatbox is a desktop client for ChatGPT, Claude, and other LLMs, providing features like local data storage, multiple LLM provider support, image generation, enhanced prompting, keyboard shortcuts, and more. It offers a user-friendly interface with dark theme, team collaboration, cross-platform availability, web version access, iOS & Android apps, multilingual support, and ongoing feature enhancements. Developed for prompt and API debugging, it has gained popularity for daily chatting and professional role-playing with AI assistance.

github

: 33.5k

fast-llm-security-guardrails

ZenGuard AI enables AI developers to integrate production-level, low-code LLM (Large Language Model) guardrails into their generative AI applications effortlessly. With ZenGuard AI, ensure your application operates within trusted boundaries, is protected from prompt injections, and maintains user privacy without compromising on performance.

github

: 93

For similar tasks

pyspur

github

: 4.2k

airflow

Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows. When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative. Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.

github

: 39.5k

rill-flow

Rill Flow is a high-performance, scalable distributed workflow orchestration service that supports the execution of tens of millions of tasks per day with task execution latency less than 100ms. It is distributed and supports the orchestration and scheduling of heterogeneous distributed systems. Rill Flow is easy to use, supporting visual process orchestration and plug-in access. It is cloud native, allowing for cloud native container deployment and cloud native function orchestration. Additionally, Rill Flow supports rapid integration of LLM model services.

github

: 306

PySpur

PySpur is a graph-based editor designed for LLM workflows, offering modular building blocks for easy workflow creation and debugging at node level. It allows users to evaluate final performance and promises self-improvement features in the future. PySpur is easy-to-hack, supports JSON configs for workflow graphs, and is lightweight with minimal dependencies, making it a versatile tool for workflow management in the field of AI and machine learning.

github

: 975

Co-LLM-Agents

This repository contains code for building cooperative embodied agents modularly with large language models. The agents are trained to perform tasks in two different environments: ThreeDWorld Multi-Agent Transport (TDW-MAT) and Communicative Watch-And-Help (C-WAH). TDW-MAT is a multi-agent environment where agents must transport objects to a goal position using containers. C-WAH is an extension of the Watch-And-Help challenge, which enables agents to send messages to each other. The code in this repository can be used to train agents to perform tasks in both of these environments.

github

: 202

GPT4Point

GPT4Point is a unified framework for point-language understanding and generation. It aligns 3D point clouds with language, providing a comprehensive solution for tasks such as 3D captioning and controlled 3D generation. The project includes an automated point-language dataset annotation engine, a novel object-level point cloud benchmark, and a 3D multi-modality model. Users can train and evaluate models using the provided code and datasets, with a focus on improving models' understanding capabilities and facilitating the generation of 3D objects.

github

: 253

asreview

The ASReview project implements active learning for systematic reviews, utilizing AI-aided pipelines to assist in finding relevant texts for search tasks. It accelerates the screening of textual data with minimal human input, saving time and increasing output quality. The software offers three modes: Oracle for interactive screening, Exploration for teaching purposes, and Simulation for evaluating active learning models. ASReview LAB is designed to support decision-making in any discipline or industry by improving efficiency and transparency in screening large amounts of textual data.

github

: 709

Groma

Groma is a grounded multimodal assistant that excels in region understanding and visual grounding. It can process user-defined region inputs and generate contextually grounded long-form responses. The tool presents a unique paradigm for multimodal large language models, focusing on visual tokenization for localization. Groma achieves state-of-the-art performance in referring expression comprehension benchmarks. The tool provides pretrained model weights and instructions for data preparation, training, inference, and evaluation. Users can customize training by starting from intermediate checkpoints. Groma is designed to handle tasks related to detection pretraining, alignment pretraining, instruction finetuning, instruction following, and more.

github

: 374

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 855

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.3k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 30.6k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675

pyspur

README:

🕸️ Why PySpur?

Problem: It takes a 1,000 tiny paper cuts to make AI reliable

Solution: A playground for agents that saves time

Step 1: Define Test Cases

Step 2: Build the agent in Python code or via UI

Step 3: Iterate obsessively

Step 4: Deploy

✨ Core features:

⚡ Quick start

😎 Feature Reel

Human-in-the-loop breakpoints:

Debug at Node Level:

Multimodal (Upload files or paste URLs)

Loops

RAG

Step 1) Create Document Collection (Chunking + Parsing)

Step 2) Create Vector Index (Embedding + Vector DB Upsert)

Modular Building Blocks

Evaluate Final Performance

Coming soon: Self-improvement

🛠️ PySpur Development Setup

[ Instructions for development on Unix-like systems. Development on Windows/PC not supported ]

⭐ Support us

For Tasks:

For Jobs:

Alternative AI tools for pyspur

Similar Open Source Tools

pyspur

biniou

airflow-code-editor

ComfyUI-fal-API

rtp-llm

shards

fastRAG

better-genshin-impact

CrewAI-Studio

promptfoo

duolingo-clone

ComfyUI-Copilot

chatbox

llama-assistant

chatbox

fast-llm-security-guardrails

For similar tasks

pyspur

airflow

rill-flow

PySpur

Co-LLM-Agents

GPT4Point

asreview

Groma

For similar jobs

weave

LLMStack

VisionCraft

kaito

PyRIT

tabby

spear

Magick