
open-webui-tools
Open WebUI Tools is a modular toolkit designed to extend and enrich your Open WebUI instance, turning it into a powerful AI workstation. With a suite of over 15 specialized tools, function pipes, and filters, this project supports academic research, agentic autonomy, multimodal creativity, workflows, and more.
Stars: 348

Open WebUI Tools Collection is a set of tools for structured planning, arXiv paper search, Hugging Face text-to-image generation, prompt enhancement, and multi-model conversations. It enhances LLM interactions with academic research, image generation, and conversation management. Tools include arXiv Search Tool and Hugging Face Image Generator. Function Pipes like Planner Agent offer autonomous plan generation and execution. Filters like Prompt Enhancer improve prompt quality. Installation and configuration instructions are provided for each tool and pipe.
README:
A modular collection of tools, function pipes, and filters to supercharge your Open WebUI experience.
Transform your Open WebUI instance into a powerful AI workstation with this comprehensive toolkit. From academic research and image generation to music creation and autonomous agents, this collection provides everything you need to extend your AI capabilities.
This repository contains 15+ specialized tools and functions designed to enhance your Open WebUI experience:
- arXiv Search - Academic paper discovery (no API key required!)
- Perplexica Search - Web search using Perplexica API with citations
- Pexels Media Search - High-quality photos and videos from Pexels API
- Native Image Generator - Direct Open WebUI image generation with Ollama model management
- Hugging Face Image Generator - AI-powered image creation
- ComfyUI ACE Step Audio - Advanced music generation
- Flux Kontext ComfyUI - Professional image editing
- Planner Agent v2 - Advanced autonomous agent with specialized models, interactive guidance, and comprehensive execution management
- arXiv Research MCTS - Advanced research with Monte Carlo Tree Search
- Multi Model Conversations - Multi-agent discussions
- Resume Analyzer - Professional resume analysis
- Mopidy Music Controller - Music server management
- Letta Agent - Autonomous agent integration
- MCP Pipe - Model Context Protocol integration
- Prompt Enhancer - Automatic prompt improvement
- Semantic Router - Intelligent model selection
- Full Document - File processing capabilities
- Clean Thinking Tags - Conversation cleanup
- Visit https://openwebui.com/u/haervwe
- Browse the collection and click "Get" for desired tools
- Follow the installation prompts in your Open WebUI instance
- Copy `.py` files from the `tools/`, `functions/`, or `filters/` directories
- Navigate to Open WebUI Workspace > Tools/Functions/Filters
- Paste the code, provide a name and description, then save
- Plug-and-Play: Most tools work out of the box with minimal configuration
- Visual Integration: Seamless integration with ComfyUI workflows
- AI-Powered: Advanced features like MCTS research and autonomous planning
- Academic Focus: arXiv integration for research and academic work
- Creative Tools: Music generation and image editing capabilities
- Smart Routing: Intelligent model selection and conversation management
- Document Processing: Full document analysis and resume processing
- Open WebUI: Version 0.6.0+ recommended
- Python: 3.8 or higher
- Optional Dependencies:
  - ComfyUI (for image/music generation tools)
  - Mopidy (for music controller)
  - Various API keys (Hugging Face, Tavily, etc.)
Most tools are designed to work with minimal configuration. Key configuration areas:
- API Keys: Required for some tools (Hugging Face, Tavily, etc.)
- ComfyUI Integration: For image and music generation tools
- Model Selection: Choose appropriate models for your use case
- Filter Setup: Enable filters in your model configuration
- arXiv Search Tool
- Perplexica Search Tool
- Pexels Media Search Tool
- Native Image Generator
- Hugging Face Image Generator
- Cloudflare Workers AI Image Generator
- SearxNG Image Search Tool
- ComfyUI ACE Step Audio Tool
- Flux Kontext ComfyUI Pipe
- Planner Agent v2
- arXiv Research MCTS Pipe
- Multi Model Conversations Pipe
- Resume Analyzer Pipe
- Mopidy Music Controller
- Letta Agent Pipe
- MCP Pipe
- Prompt Enhancer Filter
- Semantic Router Filter
- Full Document Filter
- Clean Thinking Tags Filter
- Using the Provided ComfyUI Workflows
- Installation
- Contributing
- License
- Credits
- Support
Search arXiv.org for relevant academic papers on any topic. No API key required!
- No configuration required. Works out of the box.
- Example: Search for recent papers about "tree of thought"
- Returns up to 5 most relevant papers, sorted by most recent.
Example arXiv search result in Open WebUI
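For context, here is a minimal sketch of the kind of request the tool makes. The arXiv Atom API shown is public and documented; the helper name and return shape are illustrative, not the tool's actual code:

```python
# Minimal sketch of an arXiv API query (public endpoint, no key needed).
# Parameter names follow the official arXiv API docs; the helper itself
# is illustrative, not the tool's internal implementation.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

def search_arxiv(topic, max_results=5):
    params = urllib.parse.urlencode({
        "search_query": f"all:{topic}",
        "max_results": max_results,
        "sortBy": "submittedDate",   # newest papers first
        "sortOrder": "descending",
    })
    url = f"http://export.arxiv.org/api/query?{params}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        root = ET.fromstring(resp.read())
    ns = {"atom": "http://www.w3.org/2005/Atom"}
    return [
        {
            "title": " ".join(entry.findtext("atom:title", "", ns).split()),
            "link": entry.findtext("atom:id", "", ns),
            "published": entry.findtext("atom:published", "", ns),
        }
        for entry in root.findall("atom:entry", ns)
    ]

for paper in search_arxiv("tree of thought"):
    print(paper["published"][:10], paper["title"])
```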
Search the web for factual information, current events, or specific topics using the Perplexica API. This tool provides comprehensive search results with citations and sources, making it ideal for research and information gathering. Perplexica is an open-source AI-powered search engine and alternative to Perplexity AI that must be self-hosted locally. It uses advanced language models to provide accurate, contextual answers with proper source attribution.
- `BASE_URL` (str): Base URL for the Perplexica API (default: `http://host.docker.internal:3001`)
- `OPTIMIZATION_MODE` (str): Search optimization mode, "speed" or "balanced" (default: `balanced`)
- `CHAT_MODEL` (str): Default chat model for search processing (default: `llama3.1:latest`)
- `EMBEDDING_MODEL` (str): Default embedding model for search (default: `bge-m3:latest`)
- `OLLAMA_BASE_URL` (str): Base URL for the Ollama API (default: `http://host.docker.internal:11434`)
Prerequisites: You must have Perplexica installed and running locally at the configured URL. Perplexica is a self-hosted open-source search engine that requires Ollama with the specified chat and embedding models. Follow the installation instructions in the Perplexica repository to set up your local instance.
- Example: Search for "latest developments in AI safety research 2024"
- Returns comprehensive search results with proper citations
- Automatically emits citations for source tracking in Open WebUI
- Provides both summary and individual source links
- Web Search Integration: Direct access to current web information
- Citation Support: Automatic citation generation for Open WebUI
- Model Flexibility: Configurable chat and embedding models
- Real-time Status: Progress updates during search execution
- Source Tracking: Individual source citations with metadata
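If you want to test your Perplexica instance directly, here is a hedged sketch of a raw search call. The endpoint and field names follow Perplexica's documented search API, but the exact schema varies between releases, so check SEARCH.md in the Perplexica repository:

```python
# A hedged sketch of a direct call to a local Perplexica instance.
# The /api/search route and field names follow Perplexica's documented
# search API, but the exact schema differs between releases.
import json
import urllib.request

BASE_URL = "http://host.docker.internal:3001"

payload = {
    "query": "latest developments in AI safety research 2024",
    "focusMode": "webSearch",        # Perplexica's general web-search mode
    "optimizationMode": "balanced",  # mirrors the tool's OPTIMIZATION_MODE valve
}
req = urllib.request.Request(
    f"{BASE_URL}/api/search",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=120) as resp:
    result = json.load(resp)

print(result.get("message", ""))        # synthesized answer
for src in result.get("sources", []):   # citation metadata
    print(src.get("metadata", {}).get("url"))
```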
Search and retrieve high-quality photos and videos from the Pexels API. This tool provides access to Pexels' extensive collection of free stock photos and videos, with comprehensive search capabilities, automatic citation generation, and direct image display in chat. Perfect for finding professional-quality media for presentations, content creation, or creative projects.
- `PEXELS_API_KEY` (str): Free Pexels API key from https://www.pexels.com/api/ (required)
- `DEFAULT_PER_PAGE` (int): Default number of results per search (default: 5, recommended for LLMs)
- `MAX_RESULTS_PER_PAGE` (int): Maximum allowed results per page (default: 15, prevents overwhelming LLMs)
- `DEFAULT_ORIENTATION` (str): Default photo orientation: "all", "landscape", "portrait", or "square" (default: "all")
- `DEFAULT_SIZE` (str): Default minimum photo size: "all", "large" (24MP), "medium" (12MP), or "small" (4MP) (default: "all")
Prerequisites: Get a free API key from Pexels API and configure it in the tool's Valves settings.
- Photo Search Example: Search for photos of "modern office workspace"
- Video Search Example: Search for videos of "ocean waves at sunset"
- Curated Photos Example: Get curated photos from Pexels
- Three Search Functions: `search_photos`, `search_videos`, and `get_curated_photos`
- Direct Image Display: Images are automatically formatted with markdown for immediate display in chat
- Advanced Filtering: Filter by orientation, size, color, and quality
- Attribution Support: Automatic citation generation with photographer credits
- Rate Limit Handling: Built-in error handling for API limits and invalid keys
- LLM Optimized: Results are limited and formatted to prevent overwhelming language models
- Real-time Status: Progress updates during search execution
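Under the hood the tool talks to the public Pexels REST API. The sketch below shows the documented endpoint, auth header, and response fields; the markdown output mirrors how results display in chat, while the tool's own function names differ:

```python
# Minimal sketch of the underlying Pexels REST call. Endpoint, header,
# and response fields follow the public Pexels API docs.
import json
import urllib.parse
import urllib.request

PEXELS_API_KEY = "your-pexels-api-key"  # from https://www.pexels.com/api/

params = urllib.parse.urlencode({"query": "modern office workspace", "per_page": 5})
req = urllib.request.Request(
    f"https://api.pexels.com/v1/search?{params}",
    headers={"Authorization": PEXELS_API_KEY},
)
with urllib.request.urlopen(req, timeout=30) as resp:
    data = json.load(resp)

for photo in data["photos"]:
    # Markdown like this is what lets images render directly in chat.
    print(f"![{photo['alt']}]({photo['src']['medium']}) by {photo['photographer']}")
```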
Generate images using Open WebUI's native image generation middleware configured in admin settings. This tool leverages whatever image generation backend you have configured (such as AUTOMATIC1111, ComfyUI, or OpenAI DALL-E) through Open WebUI's built-in image generation system, with optional Ollama model management to free up VRAM when needed.
- `unload_ollama_models` (bool): Whether to unload all Ollama models from VRAM before generating images (default: `False`)
- `ollama_url` (str): Ollama API URL for model management (default: `http://host.docker.internal:11434`)
Prerequisites: You must have image generation configured in Open WebUI's admin settings under Settings > Images. This tool works with any image generation backend you have set up (AUTOMATIC1111, ComfyUI, OpenAI, etc.).
- Example: Generate an image of "a serene mountain landscape at sunset"
- Uses whatever image generation backend is configured in Open WebUI admin settings
- Automatically manages model resources if Ollama unloading is enabled
- Returns markdown-formatted image links for immediate display
- Native Integration: Uses Open WebUI's native image generation middleware without external dependencies
- Backend Agnostic: Works with any image generation backend configured in admin settings (AUTOMATIC1111, ComfyUI, OpenAI, etc.)
- Memory Management: Optional Ollama model unloading to optimize VRAM usage
- Flexible Model Support: You can prompt the agent to change the image generation model, provided you give it the model's name.
- Real-time Status: Provides generation progress updates via event emitter
- Error Handling: Comprehensive error reporting and recovery
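The VRAM management relies on documented Ollama endpoints. Here is a hedged sketch of the unloading step; how the tool sequences these calls internally is an assumption:

```python
# A hedged sketch of freeing VRAM before image generation. Both endpoints
# (/api/ps to list loaded models, /api/generate with keep_alive=0 to evict
# one) are documented Ollama APIs.
import json
import urllib.request

OLLAMA_URL = "http://host.docker.internal:11434"

# List currently loaded models.
with urllib.request.urlopen(f"{OLLAMA_URL}/api/ps", timeout=30) as resp:
    loaded = json.load(resp).get("models", [])

# Requesting a completion with keep_alive=0 evicts the model from VRAM.
for model in loaded:
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps({"model": model["name"], "keep_alive": 0}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=30).read()
    print(f"Unloaded {model['name']}")
```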
Generate high-quality images from text descriptions using Hugging Face's Stable Diffusion models.
- API Key (Required): Obtain a Hugging Face API key from your HuggingFace account and set it in the tool's configuration in Open WebUI.
- API URL (Optional): Uses Stability AI's SD 3.5 Turbo model as default. Can be customized to use other HF text-to-image model endpoints.
- Example: Create an image of "beautiful horse running free"
- Multiple image format options: Square, Landscape, Portrait, etc.
Example image generated with Hugging Face tool
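For reference, a text-to-image call against the Hugging Face Inference API looks roughly like the sketch below. The endpoint pattern and payload follow the public HF docs; the model ID shown is an assumption based on the SD 3.5 Turbo default mentioned above:

```python
# A hedged sketch of an HF Inference API text-to-image call. The
# {"inputs": ...} payload is the documented format; the model ID is an
# assumed stand-in for the tool's default endpoint.
import urllib.request

HF_API_KEY = "hf_..."  # from your Hugging Face account settings
MODEL = "stabilityai/stable-diffusion-3.5-large-turbo"  # assumed default

req = urllib.request.Request(
    f"https://api-inference.huggingface.co/models/{MODEL}",
    data=b'{"inputs": "beautiful horse running free"}',
    headers={
        "Authorization": f"Bearer {HF_API_KEY}",
        "Content-Type": "application/json",
    },
)
with urllib.request.urlopen(req, timeout=120) as resp:
    image_bytes = resp.read()  # raw image bytes (e.g., JPEG)

with open("horse.jpg", "wb") as f:
    f.write(image_bytes)
```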
Generate images using Cloudflare Workers AI text-to-image models, including FLUX, Stable Diffusion XL, SDXL Lightning, and DreamShaper LCM. This tool provides model-specific prompt preprocessing, parameter optimization, and direct image display in chat. It supports fast and high-quality image generation with minimal configuration.
- `cloudflare_api_token` (str): Your Cloudflare API Token (required)
- `cloudflare_account_id` (str): Your Cloudflare Account ID (required)
- `default_model` (str): Default model to use (e.g., `@cf/black-forest-labs/flux-1-schnell`)

Prerequisites: Obtain a Cloudflare API Token and Account ID from your Cloudflare dashboard. No additional dependencies beyond `requests`.
- Example: `await tools.generate_image(prompt="A futuristic cityscape at sunset, vibrant colors")`
- Returns a markdown-formatted image link for immediate display in chat.
- Multiple Models: Supports FLUX, SDXL, SDXL Lightning, DreamShaper LCM
- Prompt Optimization: Automatic prompt enhancement for best results per model
- Parameter Handling: Smart handling of steps, guidance, negative prompts, and size
- Direct Image Display: Returns markdown image links for chat
- Error Handling: Comprehensive error and status reporting
- Real-time Status: Progress updates via event emitter
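A hedged sketch of the underlying Workers AI request, using the `requests` dependency noted above. The `/ai/run` route is Cloudflare's documented endpoint; note that FLUX models return base64 JSON while the SDXL models return raw image bytes:

```python
# A hedged sketch of a Cloudflare Workers AI text-to-image call.
# The /ai/run route is documented; the response handling below assumes
# the FLUX base64-JSON format (SDXL models return raw bytes instead).
import base64
import requests

ACCOUNT_ID = "your-account-id"
API_TOKEN = "your-api-token"
MODEL = "@cf/black-forest-labs/flux-1-schnell"

resp = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"prompt": "A futuristic cityscape at sunset, vibrant colors"},
    timeout=120,
)
resp.raise_for_status()
image_b64 = resp.json()["result"]["image"]  # FLUX returns a base64 image

with open("cityscape.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```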
Search and retrieve images from the web using a self-hosted SearxNG instance. This tool provides privacy-respecting, multi-engine image search with direct image display in chat. Ideal for finding diverse images from multiple sources without tracking or ads.
- `SEARXNG_ENGINE_API_BASE_URL` (str): The base URL for the SearxNG search engine API (default: `http://searxng:4000/search`)
- `MAX_RESULTS` (int): Maximum number of images to return per search (default: 5)
Prerequisites: You must have a running SearxNG instance. See SearxNG documentation for setup instructions.
- Example: `await tools.search_images(query="cats", max_results=3)`
- Returns a list of markdown-formatted image links for immediate display in chat.
- Privacy-Respecting: No tracking, ads, or profiling
- Multi-Engine: Aggregates results from multiple search engines
- Direct Image Display: Images are formatted for chat display
- Customizable: Choose engines, result count, and more
- Error Handling: Handles connection and search errors gracefully
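Behind the scenes this is a plain SearxNG JSON query. The sketch below assumes `format=json` is enabled in your SearxNG `settings.yml`; result field names follow SearxNG's JSON output:

```python
# Minimal sketch of a SearxNG image query. Requires the JSON output
# format to be enabled in the instance's settings.yml.
import json
import urllib.parse
import urllib.request

SEARXNG_URL = "http://searxng:4000/search"

params = urllib.parse.urlencode({
    "q": "cats",
    "categories": "images",
    "format": "json",
})
with urllib.request.urlopen(f"{SEARXNG_URL}?{params}", timeout=30) as resp:
    results = json.load(resp)["results"][:3]

for r in results:
    print(f"![{r.get('title', 'image')}]({r.get('img_src')})")
```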
Generate music using the ACE Step AI model via ComfyUI. This tool lets you create songs from tags and lyrics, with full control over the workflow JSON and node numbers. Designed for advanced music generation and can be customized for different genres and moods.
- `comfyui_api_url` (str): ComfyUI API endpoint (e.g., `http://localhost:8188`)
- `model_name` (str): Model checkpoint to use (default: `ACE_STEP/ace_step_v1_3.5b.safetensors`)
- `workflow_json` (str): Full ACE Step workflow JSON as a string. Use `{tags}`, `{lyrics}`, and `{model_name}` as placeholders.
- `tags_node` (str): Node number for the tags input (default: `"14"`)
- `lyrics_node` (str): Node number for the lyrics input (default: `"14"`)
- `model_node` (str): Node number for the model checkpoint input (default: `"40"`)
- Import the ACE Step workflow:
  - In ComfyUI, go to the workflow import section and load `extras/ace_step_api.json`.
  - Adjust nodes as needed for your setup.
- Configure the tool in Open WebUI:
  - Set the `comfyui_api_url` to your ComfyUI backend.
  - Paste the workflow JSON (from the file or your own) into `workflow_json`.
  - Set the correct node numbers if you modified the workflow.
- Generate music:
  - Provide tags and (optionally) lyrics.
  - The tool will return a link to the generated audio file.
- Example: Generate a song in the style of "funk, pop, soul" with the following lyrics: "In the shadows where secrets hide..."
Returns a link to the generated audio or a status message. Advanced users can fully customize the workflow for different genres, moods, or creative experiments.
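To illustrate the placeholder mechanism, here is a hedged sketch of how a workflow template with `{tags}`, `{lyrics}`, and `{model_name}` can be filled in and queued on ComfyUI's documented `/prompt` endpoint. The real tool also polls for completion and resolves the output file, which is omitted here:

```python
# A hedged sketch of placeholder substitution plus submission to ComfyUI.
# The /prompt endpoint is ComfyUI's documented queue API; the polling and
# audio retrieval the tool performs afterwards are omitted.
import json
import urllib.request

COMFYUI_API_URL = "http://localhost:8188"

workflow_template = open("extras/ace_step_api.json").read()
workflow = (
    workflow_template
    .replace("{tags}", "funk, pop, soul")
    .replace("{lyrics}", "In the shadows where secrets hide...")
    .replace("{model_name}", "ACE_STEP/ace_step_v1_3.5b.safetensors")
)

req = urllib.request.Request(
    f"{COMFYUI_API_URL}/prompt",
    data=json.dumps({"prompt": json.loads(workflow)}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=30) as resp:
    print(json.load(resp))  # contains the queued prompt_id
```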
Connects Open WebUI to the Flux Kontext image-to-image editing model via ComfyUI. This pipe enables advanced image editing, style transfer, and creative transformations using the Flux Kontext workflow.
- `ComfyUI_Address` (str): Address of the running ComfyUI server (default: `http://127.0.0.1:8188`)
- `ComfyUI_Workflow_JSON` (str): The entire ComfyUI workflow in JSON format (default provided, or use `extras/flux_context_owui_api_v1.json`)
- `Prompt_Node_ID` (str): Node ID for the text prompt (default: `"6"`)
- `Image_Node_ID` (str): Node ID for the input image (default: `"196"`)
- `Seed_Node_ID` (str): Node ID for the sampler (default: `"194"`)
- `enhance_prompt` (bool): Use a vision model to enhance the prompt based on the input image (default: `False`)
- `vision_model_id` (str): The model ID to use for vision-based prompt enhancement (required if `enhance_prompt` is enabled)
- `enhancer_system_prompt` (str): System prompt used to guide the vision model when enhancing the prompt. This allows you to customize the instructions given to the vision-language model for prompt engineering. By default, it provides detailed instructions for visual prompt enhancement, but you can modify it to fit your workflow or style.
- `unload_ollama_models` (bool): Unload all Ollama models from VRAM before running (default: `False`)
- `ollama_url` (str): Ollama API URL for unloading models (default: `http://host.docker.internal:11434`)
- `max_wait_time` (int): Max wait time for generation in seconds (default: `1200`)
- Import the Flux Kontext workflow:
  - In ComfyUI, import `extras/flux_context_owui_api_v1.json` as a workflow.
  - Adjust node IDs if you modify the workflow.
- Configure the pipe in Open WebUI:
  - Set the `ComfyUI_Address` to your ComfyUI backend.
  - Paste the workflow JSON into `ComfyUI_Workflow_JSON`.
  - Set the correct node IDs for prompt, image, and sampler.
- Edit images:
  - Provide a prompt and an input image.
  - (Optional) Enable `enhance_prompt` and specify a `vision_model_id` to automatically improve your prompt using a vision-language model and the input image. The enhanced prompt will be used for image editing and shown in the chat.
  - The pipe will return the edited image.
- Example: Edit this image to look like a medieval fantasy king, preserving facial features. (If `enhance_prompt` is enabled, the vision model will refine this prompt based on the image.)
Example of Flux Kontext ComfyUI Pipe output
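The node-ID valves map onto ComfyUI's API-format workflow JSON, which is keyed by node ID. The sketch below shows the patching step; the input key names ("text", "image", "seed") are assumed from typical node types, so verify them against your exported workflow:

```python
# A hedged sketch of patching a ComfyUI API-format workflow by node ID
# before queueing it. Input key names depend on the node types in your
# workflow and are assumptions here.
import json
import random

with open("extras/flux_context_owui_api_v1.json") as f:
    workflow = json.load(f)

PROMPT_NODE_ID, IMAGE_NODE_ID, SEED_NODE_ID = "6", "196", "194"

workflow[PROMPT_NODE_ID]["inputs"]["text"] = (
    "Edit this image to look like a medieval fantasy king, preserving facial features."
)
workflow[IMAGE_NODE_ID]["inputs"]["image"] = "input_portrait.png"  # uploaded image name
workflow[SEED_NODE_ID]["inputs"]["seed"] = random.randint(0, 2**32 - 1)

# The patched dict is then submitted to ComfyUI's /prompt endpoint,
# as in the ACE Step sketch above.
print(json.dumps(workflow[PROMPT_NODE_ID], indent=2))
```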
Advanced autonomous agent with specialized model support, interactive user guidance, and comprehensive execution management.
This powerful agent autonomously generates and executes multi-step plans to achieve complex goals. It's a generalist agent capable of handling any text-based task, making it ideal for complex requests that would typically require multiple prompts and manual intervention.
- Intelligent Planning: Automatically breaks down goals into actionable steps with dependency mapping
- Specialized Models: Dedicated models for writing (WRITER_MODEL), coding (CODER_MODEL), and tool usage (ACTION_MODEL), with automatic routing
- Quality Control: Real-time output analysis with quality scoring (0.0-1.0) and iterative improvement
- Interactive Error Handling: When actions fail or produce low-quality outputs, the system pauses and prompts you with options: retry with custom guidance/instructions, retry as-is, approve the current output despite warnings, or abort the entire plan execution
- Live Progress: Real-time Mermaid diagrams with color-coded status indicators
- Template System: Final synthesis using `{{action_id}}` placeholders for seamless content assembly (sketched below)
- Native Tool Integration: Automatically discovers and uses all available Open WebUI tools
- Advanced Features: Lightweight context mode, concurrent execution, cross-action references (`@action_id`), and comprehensive validation
- MCP (OpenAPI servers) Support: Model Context Protocol integration coming soon for extended tool capabilities
Core Models:
- `MODEL`: Main planning LLM
- `ACTION_MODEL`: Tool-based actions and general tasks
- `WRITER_MODEL`: Creative writing and documentation
- `CODER_MODEL`: Code generation and development

Temperature Controls:
- `PLANNING_TEMPERATURE` (0.8): Planning creativity
- `ACTION_TEMPERATURE` (0.7): Tool execution precision
- `WRITER_TEMPERATURE` (0.9): Creative writing freedom
- `CODER_TEMPERATURE` (0.3): Code generation accuracy
- `ANALYSIS_TEMPERATURE` (0.4): Output analysis precision

Execution Settings:
- `MAX_RETRIES` (3): Retry attempts per action
- `CONCURRENT_ACTIONS` (1): Parallel processing limit
- `ACTION_TIMEOUT` (300): Individual action timeout
- `SHOW_ACTION_SUMMARIES` (true): Detailed execution summaries
- `AUTOMATIC_TAKS_REQUIREMENT_ENHANCEMENT` (false): AI-enhanced requirements
Multi-Media Content:
Search the latest AI news and create a song based on it; then search for stock images to use as an "album cover" and create a mockup of the Spotify layout in a plain HTML file with vanilla JS, embedding those assets for interactivity.
Example of Planner Agent in action, using Gemini 2.5 Flash and local music generation
Creative Writing:
Create an epic sci-fi adult novel based on current trends in academia news and social media about AI and other trending topics, with at least 10 chapters and a well-crafted world with rich characters; save each chapter, with an illustration, in a folder named after the novel in Obsidian.
Example of Planner Agent in action, using Gemini 2.5 Flash with local image generation, local saving to Obsidian, and web search
Interactive Error Recovery: The Planner Agent features intelligent error handling that engages with users when actions fail or produce suboptimal results. When issues occur, the system pauses execution and presents you with interactive options:
- Retry with Guidance: Provide custom instructions to help the agent understand what went wrong and how to improve
- Retry As-Is: Attempt the action again without modifications
- Approve Output: Accept warning-level outputs despite quality concerns
- Abort Execution: Stop the entire plan if the issue is critical
Example scenario: If an action fails to generate proper code or retrieve expected data,
you'll be prompted to either provide specific guidance ("try using a different approach")
or decide whether to continue with the current output.
Interactive error recovery dialog showing user options when an action encounters issues during plan execution
Technical Development:
Create a fully-featured Conway's Game of Life SPA with responsive UI, game controls, and pattern presets using vanilla HTML/CSS/JS
Example of Planner Agent in action using local Hermes 8b (previous version of the script)
Search arXiv.org for relevant academic papers and iteratively refine a research summary using a Monte Carlo Tree Search (MCTS) approach.
- `model`: The model ID from your LLM provider
- `tavily_api_key`: Required. Obtain your API key from tavily.com
- `max_web_search_results`: Number of web search results to fetch per query
- `max_arxiv_results`: Number of results to fetch from the arXiv API per query
- `tree_breadth`: Number of child nodes explored per MCTS iteration
- `tree_depth`: Number of MCTS iterations
- `exploration_weight`: Controls the balance between exploration and exploitation
- `temperature_decay`: Exponentially decreases LLM temperature with tree depth
- `dynamic_temperature_adjustment`: Adjusts temperature based on parent node scores
- `maximum_temperature`: Initial LLM temperature (default: 1.4)
- `minimum_temperature`: Final LLM temperature at max tree depth (default: 0.5)
- Example: Do a research summary on "DPO laser LLM training"
Example of arXiv Research MCTS Pipe output
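As a rough illustration of the temperature valves, the sketch below interpolates from `maximum_temperature` down to `minimum_temperature` as tree depth grows. The pipe's exact decay formula is not documented here, so treat this as one plausible reading:

```python
# A hedged sketch of exponential temperature decay over MCTS depth,
# interpolating between the maximum_temperature and minimum_temperature
# valves. The actual formula in the pipe may differ.
MAX_T, MIN_T = 1.4, 0.5   # valve defaults from the list above
TREE_DEPTH = 4            # number of MCTS iterations (assumed)
DECAY = 0.6               # assumed temperature_decay value

for depth in range(TREE_DEPTH + 1):
    # Shallow nodes sample creatively; deep nodes refine conservatively.
    temperature = MIN_T + (MAX_T - MIN_T) * (DECAY ** depth)
    print(f"depth {depth}: temperature {temperature:.2f}")
```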
Simulate conversations between multiple language models, each acting as a distinct character. Configure up to 5 participants.
- `number_of_participants`: Set the number of participants (1-5)
- `rounds_per_user_message`: How many rounds of replies before the user can send another message
- `participant_[1-5]_model`: Model for each participant
- `participant_[1-5]_alias`: Display name for each participant
- `participant_[1-5]_system_message`: Persona and instructions for each participant
- `all_participants_appended_message`: Global instruction appended to each prompt
- `temperature`, `top_k`, `top_p`: Standard model parameters
- Example: Start a conversation between three AI agents about climate change.
Example of Multi Model Conversations Pipe
Analyze resumes and provide tags, first impressions, adversarial analysis, potential interview questions, and career advice.
- `model`: The model ID from your LLM provider
- `dataset_path`: Local path to the resume dataset CSV file
- `rapidapi_key` (optional): For job search functionality
- `web_search`: Enable/disable web search for relevant job postings
- `prompt_templates`: Customizable templates for all steps
- Requires the Full Document Filter (see below) to work with attached files.
- Example: Analyze this resume: [Attach resume file]
Screenshots of Resume Analyzer Pipe output
Control your Mopidy music server to play songs from the local library or YouTube, manage playlists, and handle various music commands.
- `model`: The model ID from your LLM provider
- `mopidy_url`: URL for the Mopidy JSON-RPC API endpoint (default: `http://localhost:6680/mopidy/rpc`)
- `youtube_api_key`: YouTube Data API key for search
- `temperature`: Model temperature (default: 0.7)
- `max_search_results`: Maximum number of search results to return (default: 5)
- `use_iris`: Toggle to use the Iris interface or a custom HTML UI (default: True)
- `system_prompt`: System prompt for request analysis
- Example: Play the song "Imagine" by John Lennon
- Quick text commands: stop, halt, play, start, resume, continue, next, skip, pause
Example of Mopidy Music Controller Pipe
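Commands like "play" ultimately become Mopidy JSON-RPC calls. The sketch below uses documented `core.*` methods against the `mopidy_url` endpoint; the mapping from natural language to these calls is the pipe's own logic:

```python
# A hedged sketch of the Mopidy JSON-RPC calls behind a "play" request.
# The /mopidy/rpc endpoint and core.* methods are documented Mopidy APIs.
import json
import urllib.request

MOPIDY_URL = "http://localhost:6680/mopidy/rpc"

def rpc(method, **params):
    req = urllib.request.Request(
        MOPIDY_URL,
        data=json.dumps({"jsonrpc": "2.0", "id": 1,
                         "method": method, "params": params}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp).get("result")

results = rpc("core.library.search", query={"any": ["Imagine John Lennon"]}) or []
uris = [t["uri"] for r in results for t in (r.get("tracks") or [])][:1]
rpc("core.tracklist.add", uris=uris)
rpc("core.playback.play")
```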
Connect with Letta agents, enabling seamless integration of autonomous agents into Open WebUI conversations. Supports task-specific processing and maintains conversation context while communicating with the agent API.
- `agent_id`: The ID of the Letta agent to communicate with
- `api_url`: Base URL for the Letta agent API (default: `http://localhost:8283`)
- `api_token`: Bearer token for API authentication
- `task_model`: Model to use for title/tags generation tasks
- `custom_name`: Name of the agent to be displayed
- `timeout`: How long to wait for a Letta agent response, in seconds (default: 400)
- Example: Chat with the built-in long-term-memory Letta (MemGPT) agent.
The MCP Pipe integrates the Model Context Protocol (MCP) into Open WebUI, enabling seamless connections between AI assistants and various data sources, tools, and development environments. Note: This implementation only works with Python-based MCP servers. NPX or other server types are not supported by default.
MCP is a universal, open standard that replaces fragmented integrations with a single protocol for connecting AI systems with data sources. This allows you to:
- Connect to multiple MCP servers simultaneously (Python servers only)
- Access tools and prompts from connected servers
- Process queries using context-aware tools
- Support data repositories, business tools, and development environments
- Automatically discover tools and prompts
- Stream responses from tools
- Maintain conversation context across different data sources
- Open WebUI: Make sure you are running a compatible version (0.5.0+ recommended)
- Python MCP servers: You must have one or more MCP-compatible servers installed and accessible (see open-webui/openapi-servers for examples)
- MCP configuration file: A `config.json` file must be placed in the `/data/` folder inside your Open WebUI installation
- Python environment: Any additional MCP servers you add must be installed in the Open WebUI Python environment
- Install or set up your MCP servers
  - Example: mcp_server_time for time and timezone conversion, mcp_server_tavily for web search
  - Install via pip, or clone and install as needed
- Create the MCP configuration file
  - Place a `config.json` file in the `/data/` directory of your Open WebUI installation
  - Example `config.json`:

    ```json
    {
      "mcpServers": {
        "time_server": {
          "command": "python",
          "args": ["-m", "mcp_server_time", "--local-timezone=America/New_York"],
          "description": "Provides Time and Timezone conversion tools."
        },
        "tavily_server": {
          "command": "python",
          "args": ["-m", "mcp_server_tavily", "--api-key=tvly-xxx"],
          "description": "Provides web search capabilities tools."
        }
      }
    }
    ```

  - Replace `tvly-xxx` with your actual Tavily API key
  - Add additional servers as needed, following the same structure
- Install any required MCP servers
  - For each server listed in your config, ensure it is installed in the Open WebUI Python environment
  - Example: `pip install mcp_server_time`, or clone and install from source
- Restart Open WebUI
  - This ensures the new configuration and servers are loaded
- Configure the MCP Pipe in Open WebUI
  - Set the valves as needed (see below)
- `MODEL`: The LLM model to use for MCP queries (default: "Qwen2_5_16k:latest")
- `OPENAI_API_KEY`: Your OpenAI API key for API access (if using OpenAI-compatible models)
- `OPENAI_API_BASE`: Base URL for API requests (default: "http://0.0.0.0:11434/v1")
- `TEMPERATURE`: Controls randomness in responses, 0.0-1.0 (default: 0.5)
- `MAX_TOKENS`: Maximum tokens to generate (default: 1500)
- `TOP_P`: Top-p sampling parameter (default: 0.8)
- `PRESENCE_PENALTY`: Penalty for repeating topics (default: 0.8)
- `FREQUENCY_PENALTY`: Penalty for repeating tokens (default: 0.8)
# Example usage in your prompt
Use the time_server to get the current time in New York.
- You can also use the Tavily server for web search, or any other MCP server you have configured.
- The MCP Pipe will automatically discover available tools and prompts from all configured servers.
- Python servers only: This pipe does not support NPX or non-Python MCP servers. For NPX support, see the advanced MCP Pipeline below.
- Server not found: Make sure the MCP server is installed and accessible in the Python environment used by Open WebUI
- Config file not loaded: Double-check the location (`/data/config.json`) and the syntax of your config file
- API key issues: Ensure all required API keys (e.g., Tavily, OpenAI) are set correctly in the config and valves
- Advanced features: For more advanced MCP features (including NPX server support), see the MCP Pipeline Documentation
- Logs: Check Open WebUI logs for errors related to MCP server startup or communication
If you need more advanced features, such as NPX server support, see the documentation in `Pipelines/MCP_Pipeline/README_MCP_Pipeline.md` in this repository.
Uses an LLM to automatically improve the quality of your prompts before they are sent to the main language model.
- `user_customizable_template`: Tailor the instructions given to the prompt-enhancing LLM
- `show_status`: Displays status updates during the enhancement process
- `show_enhanced_prompt`: Outputs the enhanced prompt to the chat window
- `model_id`: Select the specific model to use for prompt enhancement
- Enable in your model configuration's filters section.
- The filter will automatically process each user message before it's sent to the main LLM.
Acts as a model router. Analyzes the user's message and available models, then automatically selects the most appropriate model, pipe, or preset for the task.
- Configure banned models, vision model routing, and whether to show the selection reasoning in chat.
- Enable in your model configuration's filters section.
Allows Open WebUI to process entire attached files (such as resumes or documents) as part of the conversation. Cleans and prepends the file content to the first user message, ensuring the LLM receives the full context.
- `priority` (int): Priority level for the filter operations (default: `0`)
- `max_turns` (int): Maximum allowable conversation turns for a user (default: `8`)

- `max_turns` (int): Maximum allowable conversation turns for a user (default: `4`)
- Enable the filter in your model configuration.
- When you attach a file in Open WebUI, the filter will automatically clean and inject the file content into your message.
- No manual configuration is needed for most users.
- Example: Analyze this resume: [Attach resume file]
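For developers, here is a hedged skeleton of how an Open WebUI filter of this kind is shaped: a Filter class with pydantic Valves and an inlet() hook that rewrites the request body before it reaches the LLM. The file-object structure and cleaning logic are simplified assumptions, not this filter's actual code:

```python
# A hedged skeleton of an Open WebUI filter. The Filter/Valves/inlet
# structure is the standard plugin shape; the attached-file field layout
# shown here is an assumption and may differ between versions.
from pydantic import BaseModel

class Filter:
    class Valves(BaseModel):
        priority: int = 0
        max_turns: int = 8

    def __init__(self):
        self.valves = self.Valves()

    def inlet(self, body: dict) -> dict:
        files = body.get("files", [])
        messages = body.get("messages", [])
        if files and messages:
            # Prepend the (cleaned) file text to the first user message.
            file_text = "\n\n".join(
                f.get("file", {}).get("data", {}).get("content", "")
                for f in files
            )
            for msg in messages:
                if msg.get("role") == "user":
                    msg["content"] = f"{file_text}\n\n{msg['content']}"
                    break
        return body
```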
Checks if an assistant's message ends with an unclosed or incomplete "thinking" tag. If so, it extracts the unfinished thought and presents it as a user-visible message.
- No configuration required.
- Works automatically when enabled.
- Open ComfyUI.
- Click the "Load Workflow" or "Import" button.
- Select the provided JSON file (e.g., `ace_step_api.json` or `flux_context_owui_api_v1.json`).
- Save or modify as needed.
- Use the node numbers in your Open WebUI tool configuration.
- Always check node numbers after importing, as they may change if you modify the workflow.
- You can create and share your own workflows by exporting them from ComfyUI.
This approach allows you to leverage state-of-the-art image and music generation/editing models with full control and customization, directly from Open WebUI.
- Visit https://openwebui.com/u/haervwe
- Click "Get" for desired tool/pipe/filter.
- Follow prompts in your Open WebUI instance.
- Copy `.py` files from `tools/`, `functions/`, or `filters/` into Open WebUI via the Workspace > Tools/Functions/Filters section.
- Provide a name and description, then save.
Feel free to contribute to this project by:
- Forking the repository
- Creating your feature branch
- Committing your changes
- Opening a pull request
MIT License
- Developed by Haervwe
- Credit to the amazing teams behind Open WebUI, ComfyUI, and the other projects this collection builds on
- And all model trainers out there providing these amazing tools.
# Search for recent papers on a topic
Search for recent papers about "large language model training"
# Conduct comprehensive research
Do a research summary on "DPO laser LLM training"
# Generate images
Create an image of "beautiful horse running free"
# Create music
Generate a song in the style of "funk, pop, soul" with lyrics: "In the shadows where secrets hide..."
# Edit images
Edit this image to look like a medieval fantasy king, preserving facial features
# Analyze documents
Analyze this resume: [Attach resume file]
# Plan complex tasks
Create a fully-featured Single Page Application (SPA) for Conway's Game of Life
# Start group discussions
Start a conversation between three AI agents about climate change
This collection is part of the broader Open WebUI ecosystem. Here's how you can get involved:
- Open WebUI Hub: Discover more tools at openwebui.com
- Documentation: Learn more about Open WebUI at docs.openwebui.com
- Ideas: Share your ideas and feature requests
- Bug Reports: Help improve the tools by reporting issues
- Star the Repository: Show your support by starring this repo
For issues, questions, or suggestions, please open an issue on the GitHub repository.
Similar Open Source Tools


ml-retreat
ML-Retreat is a comprehensive machine learning library designed to simplify and streamline the process of building and deploying machine learning models. It provides a wide range of tools and utilities for data preprocessing, model training, evaluation, and deployment. With ML-Retreat, users can easily experiment with different algorithms, hyperparameters, and feature engineering techniques to optimize their models. The library is built with a focus on scalability, performance, and ease of use, making it suitable for both beginners and experienced machine learning practitioners.

tools
Strands Agents Tools is a community-driven project that provides a powerful set of tools for your agents to use. It bridges the gap between large language models and practical applications by offering ready-to-use tools for file operations, system execution, API interactions, mathematical operations, and more. The tools cover a wide range of functionalities including file operations, shell integration, memory storage, web infrastructure, HTTP client, Slack client, Python execution, mathematical tools, AWS integration, image and video processing, audio output, environment management, task scheduling, advanced reasoning, swarm intelligence, dynamic MCP client, parallel tool execution, browser automation, diagram creation, RSS feed management, and computer automation.

deeppowers
Deeppowers is a powerful Python library for deep learning applications. It provides a wide range of tools and utilities to simplify the process of building and training deep neural networks. With Deeppowers, users can easily create complex neural network architectures, perform efficient training and optimization, and deploy models for various tasks. The library is designed to be user-friendly and flexible, making it suitable for both beginners and experienced deep learning practitioners.

uwazi
Uwazi is a flexible database application designed for capturing and organizing collections of information, with a focus on document management. It is developed and supported by HURIDOCS, benefiting human rights organizations globally. The tool requires NodeJs, ElasticSearch, ICU Analysis Plugin, MongoDB, Yarn, and pdftotext for installation. It offers production and development installation guides, including Docker setup. Uwazi supports hot reloading, unit and integration testing with JEST, and end-to-end testing with Nightmare or Puppeteer. The system requirements include RAM, CPU, and disk space recommendations for on-premises and development usage.

LLMs-playground
LLMs-playground is a repository containing code examples and tutorials for learning and experimenting with Large Language Models (LLMs). It provides a hands-on approach to understanding how LLMs work and how to fine-tune them for specific tasks. The repository covers various LLM architectures, pre-training techniques, and fine-tuning strategies, making it a valuable resource for researchers, students, and practitioners interested in natural language processing and machine learning. By exploring the code and following the tutorials, users can gain practical insights into working with LLMs and apply their knowledge to real-world projects.

baibot
Baibot is a versatile chatbot framework designed to simplify the process of creating and deploying chatbots. It provides a user-friendly interface for building custom chatbots with various functionalities such as natural language processing, conversation flow management, and integration with external APIs. Baibot is highly customizable and can be easily extended to suit different use cases and industries. With Baibot, developers can quickly create intelligent chatbots that can interact with users in a seamless and engaging manner, enhancing user experience and automating customer support processes.

Memento
Memento is a lightweight and user-friendly version control tool designed for small to medium-sized projects. It provides a simple and intuitive interface for managing project versions and collaborating with team members. With Memento, users can easily track changes, revert to previous versions, and merge different branches. The tool is suitable for developers, designers, content creators, and other professionals who need a streamlined version control solution. Memento simplifies the process of managing project history and ensures that team members are always working on the latest version of the project.

verl-tool
The verl-tool is a versatile command-line utility designed to streamline various tasks related to version control and code management. It provides a simple yet powerful interface for managing branches, merging changes, resolving conflicts, and more. With verl-tool, users can easily track changes, collaborate with team members, and ensure code quality throughout the development process. Whether you are a beginner or an experienced developer, verl-tool offers a seamless experience for version control operations.

req_llm
ReqLLM is a Req-based library for LLM interactions, offering a unified interface to AI providers through a plugin-based architecture. It brings composability and middleware advantages to LLM interactions, with features like auto-synced providers/models, typed data structures, ergonomic helpers, streaming capabilities, usage & cost extraction, and a plugin-based provider system. Users can easily generate text, structured data, embeddings, and track usage costs. The tool supports various AI providers like Anthropic, OpenAI, Groq, Google, and xAI, and allows for easy addition of new providers. ReqLLM also provides API key management, detailed documentation, and a roadmap for future enhancements.

context-portal
Context-portal is a versatile tool for managing and visualizing data in a collaborative environment. It provides a user-friendly interface for organizing and sharing information, making it easy for teams to work together on projects. With features such as customizable dashboards, real-time updates, and seamless integration with popular data sources, Context-portal streamlines the data management process and enhances productivity. Whether you are a data analyst, project manager, or team leader, Context-portal offers a comprehensive solution for optimizing workflows and driving better decision-making.

deepflow
DeepFlow is an open-source project that provides deep observability for complex cloud-native and AI applications. It offers Zero Code data collection with eBPF for metrics, distributed tracing, request logs, and function profiling. DeepFlow is integrated with SmartEncoding to achieve Full Stack correlation and efficient access to all observability data. With DeepFlow, cloud-native and AI applications automatically gain deep observability, removing the burden of developers continually instrumenting code and providing monitoring and diagnostic capabilities covering everything from code to infrastructure for DevOps/SRE teams.

agentic
Agentic is a lightweight and flexible Python library for building multi-agent systems. It provides a simple and intuitive API for creating and managing agents, defining their behaviors, and simulating interactions in a multi-agent environment. With Agentic, users can easily design and implement complex agent-based models to study emergent behaviors, social dynamics, and decentralized decision-making processes. The library supports various agent architectures, communication protocols, and simulation scenarios, making it suitable for a wide range of research and educational applications in the fields of artificial intelligence, machine learning, social sciences, and robotics.

Fast-LLM
Fast-LLM is an open-source library designed for training large language models with exceptional speed, scalability, and flexibility. Built on PyTorch and Triton, it offers optimized kernel efficiency, reduced overheads, and memory usage, making it suitable for training models of all sizes. The library supports distributed training across multiple GPUs and nodes, offers flexibility in model architectures, and is easy to use with pre-built Docker images and simple configuration. Fast-LLM is licensed under Apache 2.0, developed transparently on GitHub, and encourages contributions and collaboration from the community.

evaluation-guidebook
The LLM Evaluation guidebook provides comprehensive guidance on evaluating language model performance, including different evaluation methods, designing evaluations, and practical tips. It caters to both beginners and advanced users, offering insights on model inference, tokenization, and troubleshooting. The guide covers automatic benchmarks, human evaluation, LLM-as-a-judge scenarios, troubleshooting practicalities, and general knowledge on LLM basics. It also includes planned articles on automated benchmarks, evaluation importance, task-building considerations, and model comparison challenges. The resource is enriched with recommended links and acknowledgments to contributors and inspirations.

RAG-To-Know
RAG-To-Know is a versatile tool for knowledge extraction and summarization. It leverages the RAG (Retrieval-Augmented Generation) framework to provide a seamless way to retrieve and summarize information from various sources. With RAG-To-Know, users can easily extract key insights and generate concise summaries from large volumes of text data. The tool is designed to streamline the process of information retrieval and summarization, making it ideal for researchers, students, journalists, and anyone looking to quickly grasp the essence of complex information.
For similar tasks

AI-Powered-Resume-Analyzer-and-LinkedIn-Scraper-with-Selenium
Resume Analyzer AI is an advanced Streamlit application that specializes in thorough resume analysis. It excels at summarizing resumes, evaluating strengths, identifying weaknesses, and offering personalized improvement suggestions. It also recommends job titles and uses Selenium to extract vital LinkedIn data. The tool simplifies the job-seeking journey by providing comprehensive insights to elevate career opportunities.

AI-Resume-Analyzer-and-LinkedIn-Scraper-using-LLM
Developed an advanced AI application that utilizes LLM and OpenAI for comprehensive resume analysis. It excels at summarizing the resume, evaluating strengths, identifying weaknesses, and offering personalized improvement suggestions, while also recommending the perfect job titles. Additionally, it seamlessly employs Selenium to extract vital LinkedIn data, encompassing company names, job titles, locations, job URLs, and detailed job descriptions. This application simplifies the job-seeking journey by equipping users with comprehensive insights to elevate their career opportunities.

AI-Resume-Analyzer-and-LinkedIn-Scraper-using-Generative-AI
Developed an advanced AI application that utilizes LLM and OpenAI for comprehensive resume analysis. It excels at summarizing the resume, evaluating strengths, identifying weaknesses, and offering personalized improvement suggestions, while also recommending the perfect job titles. Additionally, it seamlessly employs Selenium to extract vital LinkedIn data, encompassing company names, job titles, locations, job URLs, and detailed job descriptions. This application simplifies the job-seeking journey by equipping users with comprehensive insights to elevate their career opportunities.


lollms-webui
LoLLMs WebUI (Lord of Large Language Multimodal Systems: One tool to rule them all) is a user-friendly interface to access and utilize various LLM (Large Language Models) and other AI models for a wide range of tasks. With over 500 AI expert conditionings across diverse domains and more than 2500 fine tuned models over multiple domains, LoLLMs WebUI provides an immediate resource for any problem, from car repair to coding assistance, legal matters, medical diagnosis, entertainment, and more. The easy-to-use UI with light and dark mode options, integration with GitHub repository, support for different personalities, and features like thumb up/down rating, copy, edit, and remove messages, local database storage, search, export, and delete multiple discussions, make LoLLMs WebUI a powerful and versatile tool.

daily-poetry-image
Daily Chinese ancient poetry and AI-generated images powered by Bing DALL-E-3. GitHub Action triggers the process automatically. Poetry is provided by Today's Poem API. The website is built with Astro.

InvokeAI
InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products.

LocalAI
LocalAI is a free and open-source OpenAI alternative that acts as a drop-in replacement REST API compatible with OpenAI (Elevenlabs, Anthropic, etc.) API specifications for local AI inferencing. It allows users to run LLMs, generate images, audio, and more locally or on-premises with consumer-grade hardware, supporting multiple model families and not requiring a GPU. LocalAI offers features such as text generation with GPTs, text-to-audio, audio-to-text transcription, image generation with stable diffusion, OpenAI functions, embeddings generation for vector databases, constrained grammars, downloading models directly from Huggingface, and a Vision API. It provides a detailed step-by-step introduction in its Getting Started guide and supports community integrations such as custom containers, WebUIs, model galleries, and various bots for Discord, Slack, and Telegram. LocalAI also offers resources like an LLM fine-tuning guide, instructions for local building and Kubernetes installation, projects integrating LocalAI, and a how-tos section curated by the community. It encourages users to cite the repository when utilizing it in downstream projects and acknowledges the contributions of various software from the community.
For similar jobs

Perplexica
Perplexica is an open-source AI-powered search engine that utilizes advanced machine learning algorithms to provide clear answers with sources cited. It offers various modes like Copilot Mode, Normal Mode, and Focus Modes for specific types of questions. Perplexica ensures up-to-date information by using SearxNG metasearch engine. It also features image and video search capabilities and upcoming features include finalizing Copilot Mode and adding Discover and History Saving features.

KULLM
KULLM (구름) is a Korean Large Language Model developed by Korea University NLP & AI Lab and HIAI Research Institute. It is based on the upstage/SOLAR-10.7B-v1.0 model and has been fine-tuned for instruction following. The model has been trained on 8×A100 GPUs and is capable of generating responses in the Korean language. KULLM exhibits hallucination and repetition phenomena due to its decoding strategy. Users should be cautious as the model may produce inaccurate or harmful results. Performance may vary in benchmarks without a fixed system prompt.

MMMU
MMMU is a benchmark designed to evaluate multimodal models on college-level subject knowledge tasks, covering 30 subjects and 183 subfields with 11.5K questions. It focuses on advanced perception and reasoning with domain-specific knowledge, challenging models to perform tasks akin to those faced by experts. The evaluation of various models highlights substantial challenges, with room for improvement to stimulate the community towards expert artificial general intelligence (AGI).

1filellm
1filellm is a command-line data aggregation tool designed for LLM ingestion. It aggregates and preprocesses data from various sources into a single text file, facilitating the creation of information-dense prompts for large language models. The tool supports automatic source type detection, handling of multiple file formats, web crawling functionality, integration with Sci-Hub for research paper downloads, text preprocessing, and token count reporting. Users can input local files, directories, GitHub repositories, pull requests, issues, ArXiv papers, YouTube transcripts, web pages, Sci-Hub papers via DOI or PMID. The tool provides uncompressed and compressed text outputs, with the uncompressed text automatically copied to the clipboard for easy pasting into LLMs.

gpt-researcher
GPT Researcher is an autonomous agent designed for comprehensive online research on a variety of tasks. It can produce detailed, factual, and unbiased research reports with customization options. The tool addresses issues of speed, determinism, and reliability by leveraging parallelized agent work. The main idea involves running 'planner' and 'execution' agents to generate research questions, seek related information, and create research reports. GPT Researcher optimizes costs and completes tasks in around 3 minutes. Features include generating long research reports, aggregating web sources, an easy-to-use web interface, scraping web sources, and exporting reports to various formats.

ChatTTS
ChatTTS is a generative speech model optimized for dialogue scenarios, providing natural and expressive speech synthesis with fine-grained control over prosodic features. It supports multiple speakers and surpasses most open-source TTS models in terms of prosody. The model is trained with 100,000+ hours of Chinese and English audio data, and the open-source version on HuggingFace is a 40,000-hour pre-trained model without SFT. The roadmap includes open-sourcing additional features like VQ encoder, multi-emotion control, and streaming audio generation. The tool is intended for academic and research use only, with precautions taken to limit potential misuse.

HebTTS
HebTTS is a language modeling approach to diacritic-free Hebrew text-to-speech (TTS) system. It addresses the challenge of accurately mapping text to speech in Hebrew by proposing a language model that operates on discrete speech representations and is conditioned on a word-piece tokenizer. The system is optimized using weakly supervised recordings and outperforms diacritic-based Hebrew TTS systems in terms of content preservation and naturalness of generated speech.

do-research-in-AI
This repository is a collection of research lectures and experience sharing posts from frontline researchers in the field of AI. It aims to help individuals upgrade their research skills and knowledge through insightful talks and experiences shared by experts. The content covers various topics such as evaluating research papers, choosing research directions, research methodologies, and tips for writing high-quality scientific papers. The repository also includes discussions on academic career paths, research ethics, and the emotional aspects of research work. Overall, it serves as a valuable resource for individuals interested in advancing their research capabilities in the field of AI.