
mcp-omnisearch
🔍 A Model Context Protocol (MCP) server providing unified access to multiple search engines (Tavily, Brave, Kagi), AI tools (Perplexity, FastGPT), and content processing services (Jina AI, Kagi). Combines search, AI responses, content processing, and enhancement features through a single interface.
Stars: 189

mcp-omnisearch is a Model Context Protocol (MCP) server that acts as a unified gateway to multiple search providers and AI tools. It integrates Tavily, Perplexity, Kagi, Jina AI, Brave, Exa AI, and Firecrawl to offer a wide range of search, AI response, content processing, and enhancement features through a single interface. The server provides powerful search capabilities, AI response generation, content extraction, summarization, web scraping, structured data extraction, and more. It is designed to work flexibly with the API keys available, enabling users to activate only the providers they have keys for and easily add more as needed.
README:
A Model Context Protocol (MCP) server that provides unified access to multiple search providers and AI tools. This server combines the capabilities of Tavily, Perplexity, Kagi, Jina AI, Brave, Exa AI, and Firecrawl to offer comprehensive search, AI responses, content processing, and enhancement features through a single interface.
- Tavily Search: Optimized for factual information with strong citation support. Supports domain filtering through API parameters (include_domains/exclude_domains).
- Brave Search: Privacy-focused search with good technical content coverage. Features native support for search operators (site:, -site:, filetype:, intitle:, inurl:, before:, after:, and exact phrases).
- Kagi Search: High-quality search results with minimal advertising influence, focused on authoritative sources. Supports search operators in query string (site:, -site:, filetype:, intitle:, inurl:, before:, after:, and exact phrases).
- Exa Search: AI-powered web search using neural and keyword search. Optimized for AI applications with semantic understanding, content extraction, and research capabilities.
- GitHub Search: Comprehensive code search across public GitHub repositories with three specialized tools:
  - Code Search: Find code examples, function definitions, and files using advanced syntax (filename:, path:, repo:, user:, language:, in:file)
  - Repository Search: Discover repositories with sorting by stars, forks, or recent updates
  - User Search: Find GitHub users and organizations
MCP Omnisearch provides powerful search capabilities through operators and parameters:
- Domain filtering: Available across all providers
  - Tavily: Through API parameters (include_domains/exclude_domains)
  - Brave & Kagi: Through site: and -site: operators
- File type filtering: Available in Brave and Kagi (filetype:)
- Title and URL filtering: Available in Brave and Kagi (intitle:, inurl:)
- Date filtering: Available in Brave and Kagi (before:, after:)
- Exact phrase matching: Available in Brave and Kagi ("phrase")
// Using Brave or Kagi with query string operators
{
  "query": "filetype:pdf site:microsoft.com typescript guide"
}

// Using Tavily with API parameters
{
  "query": "typescript guide",
  "include_domains": ["microsoft.com"],
  "exclude_domains": ["github.com"]
}
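For illustration, here is a minimal TypeScript sketch (not the actual mcp-omnisearch internals; all names are illustrative) of how the same domain-filtered search maps onto the two provider styles shown above, query-string operators for Brave/Kagi versus API parameters for Tavily/Exa:

// Illustrative sketch, not mcp-omnisearch source code.
interface UnifiedSearch {
  query: string;
  includeDomains?: string[];
  excludeDomains?: string[];
}

// Operator-based providers (Brave, Kagi): fold filters into the query string.
function toOperatorQuery({ query, includeDomains = [], excludeDomains = [] }: UnifiedSearch): string {
  return [
    query,
    ...includeDomains.map((d) => `site:${d}`),
    ...excludeDomains.map((d) => `-site:${d}`),
  ].join(' ');
}

// Parameter-based providers (Tavily, Exa): pass filters as API parameters.
function toApiParams({ query, includeDomains, excludeDomains }: UnifiedSearch) {
  return {
    query,
    include_domains: includeDomains,
    exclude_domains: excludeDomains,
  };
}

// toOperatorQuery({ query: 'typescript guide', includeDomains: ['microsoft.com'] })
// => "typescript guide site:microsoft.com"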
- Brave Search: Full native operator support in query string
- Kagi Search: Complete operator support in query string
- Tavily Search: Domain filtering through API parameters
- Exa Search: Domain filtering through API parameters, semantic search with neural understanding
- GitHub Search: Advanced code search syntax with qualifiers:
  - filename:remote.ts - Search for specific files
  - path:src/lib - Search within specific directories
  - repo:user/repo - Search within specific repositories
  - user:username - Search within a user's repositories
  - language:typescript - Filter by programming language
  - in:file "export function" - Search for text within files
- Perplexity AI: Advanced response generation combining real-time web search with GPT-4 Omni and Claude 3
- Kagi FastGPT: Quick AI-generated answers with citations (900ms typical response time)
- Exa Answer: Get direct AI-generated answers to questions using Exa Answer API
- Jina AI Reader: Clean content extraction with image captioning and PDF support
- Kagi Universal Summarizer: Content summarization for pages, videos, and podcasts
- Tavily Extract: Extract raw content from single or multiple web pages with configurable extraction depth ('basic' or 'advanced'). Returns both combined content and individual URL content, with metadata including word count and extraction statistics
- Firecrawl Scrape: Extract clean, LLM-ready data from single URLs with enhanced formatting options
- Firecrawl Crawl: Deep crawling of all accessible subpages on a website with configurable depth limits
- Firecrawl Map: Fast URL collection from websites for comprehensive site mapping
- Firecrawl Extract: Structured data extraction with AI using natural language prompts
- Firecrawl Actions: Support for page interactions (clicking, scrolling, etc.) before extraction for dynamic content
- Exa Contents: Extract full content from Exa search result IDs
- Exa Similar: Find web pages semantically similar to a given URL using Exa
- Kagi Enrichment API: Supplementary content from specialized indexes (Teclis, TinyGem)
- Jina AI Grounding: Real-time fact verification against web knowledge
MCP Omnisearch is designed to work with the API keys you have available. You don't need to have keys for all providers - the server will automatically detect which API keys are available and only enable those providers.
For example:
- If you only have a Tavily and Perplexity API key, only those providers will be available
- If you don't have a Kagi API key, Kagi-based services won't be available, but all other providers will work normally
- The server will log which providers are available based on the API keys you've configured
This flexibility makes it easy to get started with just one or two providers and add more as needed.
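As a rough sketch of what this key-based activation can look like (the names below are illustrative and not taken from the mcp-omnisearch source):

// Hypothetical sketch of key-based provider activation.
const PROVIDER_KEYS: Record<string, string> = {
  tavily: 'TAVILY_API_KEY',
  perplexity: 'PERPLEXITY_API_KEY',
  kagi: 'KAGI_API_KEY',
  jina: 'JINA_AI_API_KEY',
  brave: 'BRAVE_API_KEY',
  github: 'GITHUB_API_KEY',
  exa: 'EXA_API_KEY',
  firecrawl: 'FIRECRAWL_API_KEY',
};

// Only providers whose API key is present in the environment get enabled.
const available = Object.entries(PROVIDER_KEYS)
  .filter(([, envVar]) => Boolean(process.env[envVar]))
  .map(([provider]) => provider);

console.error(`Available providers: ${available.join(', ') || 'none'}`);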
This server requires configuration through your MCP client. Here are examples for different environments:
Add this to your Cline MCP settings:
{
  "mcpServers": {
    "mcp-omnisearch": {
      "command": "node",
      "args": ["/path/to/mcp-omnisearch/dist/index.js"],
      "env": {
        "TAVILY_API_KEY": "your-tavily-key",
        "PERPLEXITY_API_KEY": "your-perplexity-key",
        "KAGI_API_KEY": "your-kagi-key",
        "JINA_AI_API_KEY": "your-jina-key",
        "BRAVE_API_KEY": "your-brave-key",
        "GITHUB_API_KEY": "your-github-key",
        "EXA_API_KEY": "your-exa-key",
        "FIRECRAWL_API_KEY": "your-firecrawl-key",
        "FIRECRAWL_BASE_URL": "http://localhost:3002"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
For WSL environments, add this to your Claude Desktop configuration:
{
  "mcpServers": {
    "mcp-omnisearch": {
      "command": "wsl.exe",
      "args": [
        "bash",
        "-c",
        "TAVILY_API_KEY=key1 PERPLEXITY_API_KEY=key2 KAGI_API_KEY=key3 JINA_AI_API_KEY=key4 BRAVE_API_KEY=key5 GITHUB_API_KEY=key6 EXA_API_KEY=key7 FIRECRAWL_API_KEY=key8 FIRECRAWL_BASE_URL=http://localhost:3002 node /path/to/mcp-omnisearch/dist/index.js"
      ]
    }
  }
}
The server uses API keys for each provider. You don't need keys for all providers - only the providers corresponding to your available API keys will be activated:
- TAVILY_API_KEY: For Tavily Search
- PERPLEXITY_API_KEY: For Perplexity AI
- KAGI_API_KEY: For Kagi services (FastGPT, Summarizer, Enrichment)
- JINA_AI_API_KEY: For Jina AI services (Reader, Grounding)
- BRAVE_API_KEY: For Brave Search
- GITHUB_API_KEY: For GitHub search services (Code, Repository, User search)
- EXA_API_KEY: For Exa AI services (Search, Answer, Contents, Similar)
- FIRECRAWL_API_KEY: For Firecrawl services (Scrape, Crawl, Map, Extract, Actions)
- FIRECRAWL_BASE_URL: For self-hosted Firecrawl instances (optional, defaults to the Firecrawl cloud service)
You can start with just one or two API keys and add more later as needed. The server will log which providers are available on startup.
To use GitHub search features, you'll need a GitHub personal access token. For security, use a token with public repository access only:
- Go to GitHub Settings: Navigate to GitHub Settings > Developer settings > Personal access tokens
- Create a new token: Click "Generate new token" → "Generate new token (classic)"
- Configure token settings:
  - Name: MCP Omnisearch - Public Search
  - Expiration: Choose your preferred expiration (90 days recommended)
  - Scopes: Leave all checkboxes UNCHECKED
    ⚠️ Important: Do not select any scopes. A token with no scopes can only access public repositories and user profiles, which is exactly what we want for search functionality.
- Generate and copy: Click "Generate token" and copy the token immediately
- Add to environment: Set GITHUB_API_KEY=your_token_here
Security Notes:
- This token configuration ensures no access to private repositories
- Only public code search, repository discovery, and user profiles are accessible
- Rate limits: 5,000 requests/hour for the GitHub API overall; code search specifically is limited to 10 requests/minute
- You can revoke the token anytime from GitHub settings if needed
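If you want to sanity-check the token independently of MCP Omnisearch, the documented GitHub REST endpoint GET /search/code accepts an unscoped token (Node 18+ TypeScript, using the built-in fetch):

// Verify that an unscoped token can hit the public code search endpoint.
// The 10 requests/minute code-search limit applies to this route.
const token = process.env.GITHUB_API_KEY;
const query = encodeURIComponent('filename:remote.ts @sveltejs/kit');

const res = await fetch(`https://api.github.com/search/code?q=${query}&per_page=5`, {
  headers: {
    Authorization: `Bearer ${token}`,
    Accept: 'application/vnd.github+json',
    'X-GitHub-Api-Version': '2022-11-28',
  },
});

if (!res.ok) throw new Error(`GitHub search failed: ${res.status}`);
const data = await res.json();
console.log(data.total_count, data.items?.[0]?.path);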
If you're running a self-hosted instance of Firecrawl, you can configure MCP Omnisearch to use it by setting the FIRECRAWL_BASE_URL environment variable. This allows you to maintain complete control over your data processing pipeline.
Self-hosted Firecrawl setup:
- Follow the Firecrawl self-hosting guide
- Set up your Firecrawl instance (default runs on http://localhost:3002)
- Configure MCP Omnisearch with your self-hosted URL:
FIRECRAWL_BASE_URL=http://localhost:3002
# or for a remote self-hosted instance:
FIRECRAWL_BASE_URL=https://your-firecrawl-domain.com
Important notes:
- If FIRECRAWL_BASE_URL is not set, MCP Omnisearch will default to the Firecrawl cloud service
- Self-hosted instances support the same API endpoints (/v1/scrape, /v1/crawl, etc.)
- You'll still need a FIRECRAWL_API_KEY even for self-hosted instances
- Self-hosted Firecrawl provides enhanced security and customization options
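Conceptually, the override just swaps the base of every Firecrawl request. A minimal sketch, assuming the real Firecrawl /v1/scrape route and an illustrative helper name (this is not the mcp-omnisearch implementation):

// Route Firecrawl calls to a self-hosted instance when FIRECRAWL_BASE_URL is set.
const FIRECRAWL_BASE_URL =
  process.env.FIRECRAWL_BASE_URL ?? 'https://api.firecrawl.dev';

async function firecrawlScrape(url: string) {
  const res = await fetch(`${FIRECRAWL_BASE_URL}/v1/scrape`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url, formats: ['markdown'] }),
  });
  if (!res.ok) throw new Error(`Firecrawl scrape failed: ${res.status}`);
  return res.json();
}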
The server implements MCP Tools organized by category:
Search the web using Tavily Search API. Best for factual queries requiring reliable sources and citations.
Parameters:
- query (string, required): Search query
Example:
{
  "query": "latest developments in quantum computing"
}
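From a TypeScript MCP client, invoking this tool might look like the following sketch, built on the official @modelcontextprotocol/sdk. The tavily_search tool name is an assumption; check the server's listTools() output for the identifiers it actually registers.

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import {
  StdioClientTransport,
  getDefaultEnvironment,
} from '@modelcontextprotocol/sdk/client/stdio.js';

// Spawn the server over stdio, passing the API key through its environment.
const transport = new StdioClientTransport({
  command: 'node',
  args: ['/path/to/mcp-omnisearch/dist/index.js'],
  env: { ...getDefaultEnvironment(), TAVILY_API_KEY: 'your-tavily-key' },
});

const client = new Client({ name: 'example-client', version: '1.0.0' });
await client.connect(transport);

// Tool name assumed; list the server's tools to confirm.
const result = await client.callTool({
  name: 'tavily_search',
  arguments: { query: 'latest developments in quantum computing' },
});
console.log(result.content);

await client.close();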
Privacy-focused web search with good coverage of technical topics.
Parameters:
- query (string, required): Search query
Example:
{
  "query": "rust programming language features"
}
High-quality search results with minimal advertising influence. Best for finding authoritative sources and research materials.
Parameters:
- query (string, required): Search query
- language (string, optional): Language filter (e.g., "en")
- no_cache (boolean, optional): Bypass cache for fresh results
Example:
{
  "query": "latest research in machine learning",
  "language": "en"
}
Search for code on GitHub using advanced syntax. This tool searches through file contents in public repositories and provides code snippets with metadata.
Parameters:
- query (string, required): Search query with GitHub search syntax
- limit (number, optional): Maximum number of results (1-50, default: 10)
Example:
{
  "query": "filename:remote.ts @sveltejs/kit",
  "limit": 5
}
Advanced query examples:
- "filename:config.json path:src" - Find config.json files in src directories
- "function fetchData language:typescript" - Find fetchData functions in TypeScript
- "repo:microsoft/vscode extension" - Search within a specific repository
- "user:torvalds language:c" - Search a user's repositories for C code
Discover GitHub repositories with enhanced metadata including stars, forks, language, and last update information.
Parameters:
- query (string, required): Repository search query
- limit (number, optional): Maximum number of results (1-50, default: 10)
- sort (string, optional): Sort results by 'stars', 'forks', or 'updated'
Example:
{
  "query": "sveltekit remote functions",
  "sort": "stars",
  "limit": 5
}
Find GitHub users and organizations with profile information.
Parameters:
- query (string, required): User/organization search query
- limit (number, optional): Maximum number of results (1-50, default: 10)
Example:
{
  "query": "Rich-Harris",
  "limit": 3
}
AI-powered web search using neural and keyword search. Automatically chooses between traditional keyword search and Exa's embeddings-based model to find the most relevant results for your query.
Parameters:
- query (string, required): Search query
- limit (number, optional): Maximum number of results (1-100, default: 10)
- include_domains (array, optional): Only include results from these domains
- exclude_domains (array, optional): Exclude results from these domains
Example:
{
  "query": "latest AI research papers",
  "limit": 15,
  "include_domains": ["arxiv.org", "scholar.google.com"]
}
AI-powered response generation with real-time web search integration.
Parameters:
- query (string, required): Question or topic for AI response
Example:
{
  "query": "Explain the differences between REST and GraphQL"
}
Quick AI-generated answers with citations.
Parameters:
- query (string, required): Question for quick AI response
Example:
{
  "query": "What are the main features of TypeScript?"
}
Get direct AI-generated answers to questions using Exa Answer API.
Parameters:
- query (string, required): Question for AI response
- include_domains (array, optional): Only include sources from these domains
- exclude_domains (array, optional): Exclude sources from these domains
Example:
{
  "query": "How does machine learning work?",
  "include_domains": ["arxiv.org", "nature.com"]
}
Convert URLs to clean, LLM-friendly text with image captioning.
Parameters:
- url (string, required): URL to process
Example:
{
  "url": "https://example.com/article"
}
Summarize content from URLs.
Parameters:
- url (string, required): URL to summarize
Example:
{
  "url": "https://example.com/long-article"
}
Extract raw content from web pages with Tavily Extract.
Parameters:
- url (string | string[], required): Single URL or array of URLs to extract content from
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced'
Example:
{
  "url": [
    "https://example.com/article1",
    "https://example.com/article2"
  ],
  "extract_depth": "advanced"
}
Response includes:
- Combined content from all URLs
- Individual raw content for each URL
- Metadata with word count, successful extractions, and any failed URLs
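In TypeScript terms, a response of that shape might be typed like this; the field names are inferred from the description above and are not a published schema:

// Assumed response shape for illustration only.
interface ExtractResult {
  combined_content: string;           // content from all URLs merged together
  raw_contents: Array<{               // one entry per input URL
    url: string;
    content: string;
  }>;
  metadata: {
    word_count: number;
    successful_extractions: number;
    failed_urls: string[];
  };
}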
Extract clean, LLM-ready data from single URLs with enhanced formatting options.
Parameters:
- url (string | string[], required): Single URL or array of URLs to extract content from
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced'
Example:
{
  "url": "https://example.com/article",
  "extract_depth": "basic"
}
Response includes:
- Clean, markdown-formatted content
- Metadata including title, word count, and extraction statistics
Deep crawling of all accessible subpages on a website with configurable depth limits.
Parameters:
- url (string | string[], required): Starting URL for crawling
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced' (controls crawl depth and limits)
Example:
{
  "url": "https://example.com",
  "extract_depth": "advanced"
}
Response includes:
- Combined content from all crawled pages
- Individual content for each page
- Metadata including title, word count, and crawl statistics
Fast URL collection from websites for comprehensive site mapping.
Parameters:
- url (string | string[], required): URL to map
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced' (controls map depth)
Example:
{
  "url": "https://example.com",
  "extract_depth": "basic"
}
Response includes:
- List of all discovered URLs
- Metadata including site title and URL count
Structured data extraction with AI using natural language prompts.
Parameters:
- url (string | string[], required): URL to extract structured data from
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced'
Example:
{
  "url": "https://example.com",
  "extract_depth": "basic"
}
Response includes:
- Structured data extracted from the page
- Metadata including title and extraction statistics
Support for page interactions (clicking, scrolling, etc.) before extraction for dynamic content.
Parameters:
- url (string | string[], required): URL to interact with and extract content from
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced' (controls complexity of interactions)
Example:
{
  "url": "https://news.ycombinator.com",
  "extract_depth": "basic"
}
Response includes:
- Content extracted after performing interactions
- Description of actions performed
- Screenshot of the page (if available)
- Metadata including title and extraction statistics
Extract full content from Exa search result IDs.
Parameters:
- ids (string | string[], required): Exa search result ID(s) to extract content from
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced'
Example:
{
  "ids": ["exa-result-id-123", "exa-result-id-456"],
  "extract_depth": "advanced"
}
Response includes:
- Combined content from all result IDs
- Individual raw content for each ID
- Metadata with word count and extraction statistics
Find web pages semantically similar to a given URL using Exa.
Parameters:
- url (string, required): URL to find similar pages for
- extract_depth (string, optional): Extraction depth - 'basic' (default) or 'advanced'
Example:
{
  "url": "https://arxiv.org/abs/2106.09685",
  "extract_depth": "advanced"
}
Response includes:
- Combined content from all similar pages
- Similarity scores and metadata
- Individual content for each similar page
Get supplementary content from specialized indexes.
Parameters:
- query (string, required): Query for enrichment
Example:
{
  "query": "emerging web technologies"
}
Verify statements against web knowledge.
Parameters:
- statement (string, required): Statement to verify
Example:
{
  "statement": "TypeScript adds static typing to JavaScript"
}
MCP Omnisearch supports containerized deployment using Docker with MCPO (Model Context Protocol Over HTTP) integration, enabling cloud deployment and OpenAPI access.
- Using Docker Compose (Recommended):
# Clone the repository
git clone https://github.com/spences10/mcp-omnisearch.git
cd mcp-omnisearch
# Create .env file with your API keys
echo "TAVILY_API_KEY=your-tavily-key" > .env
echo "KAGI_API_KEY=your-kagi-key" >> .env
echo "PERPLEXITY_API_KEY=your-perplexity-key" >> .env
echo "EXA_API_KEY=your-exa-key" >> .env
# Add other API keys as needed
echo "GITHUB_API_KEY=your-github-key" >> .env
# Start the container
docker-compose up -d
- Using Docker directly:
docker build -t mcp-omnisearch .
docker run -d \
  -p 8000:8000 \
  -e TAVILY_API_KEY=your-tavily-key \
  -e KAGI_API_KEY=your-kagi-key \
  -e PERPLEXITY_API_KEY=your-perplexity-key \
  -e EXA_API_KEY=your-exa-key \
  -e GITHUB_API_KEY=your-github-key \
  --name mcp-omnisearch \
  mcp-omnisearch
Configure the container using environment variables for each provider:
- TAVILY_API_KEY: For Tavily Search
- PERPLEXITY_API_KEY: For Perplexity AI
- KAGI_API_KEY: For Kagi services (FastGPT, Summarizer, Enrichment)
- JINA_AI_API_KEY: For Jina AI services (Reader, Grounding)
- BRAVE_API_KEY: For Brave Search
- GITHUB_API_KEY: For GitHub search services
- EXA_API_KEY: For Exa AI services
- FIRECRAWL_API_KEY: For Firecrawl services
- FIRECRAWL_BASE_URL: For self-hosted Firecrawl instances (optional)
- PORT: Container port (defaults to 8000)
Once deployed, the MCP server is accessible via OpenAPI at:
- Base URL: http://your-container-host:8000
- OpenAPI Endpoint: /omnisearch
- Compatible with: OpenWebUI and other tools expecting OpenAPI
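Assuming MCPO's usual one-route-per-tool mapping under the /omnisearch prefix and a tool named tavily_search (both assumptions; the generated /docs page lists the real routes), a call could look like:

// Call a tool through the MCPO OpenAPI bridge (route and tool name assumed).
const res = await fetch('http://localhost:8000/omnisearch/tavily_search', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'rust programming language features' }),
});
console.log(await res.json());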
The containerized version can be deployed to any container platform that supports Docker:
- Cloud Run (Google Cloud)
- Container Instances (Azure)
- ECS/Fargate (AWS)
- Railway, Render, Fly.io
- Any Kubernetes cluster
Example deployment to a cloud platform:
# Build and tag for your registry
docker build -t your-registry/mcp-omnisearch:latest .
docker push your-registry/mcp-omnisearch:latest
# Deploy with your platform's CLI or web interface
# Configure environment variables through your platform's settings
Development:
- Clone the repository
- Install dependencies: pnpm install
- Build the project: pnpm run build
- Run in development mode: pnpm run dev
Publishing:
- Update version in package.json
- Build the project: pnpm run build
- Publish to npm: pnpm publish
Each provider requires its own API key and may have different access requirements:
- Tavily: Requires an API key from their developer portal
- Perplexity: API access through their developer program
- Kagi: Some features limited to Business (Team) plan users
- Jina AI: API key required for all services
- Brave: API key from their developer portal
- GitHub: Personal access token with no scopes selected (public access only)
- Exa AI: API key from their dashboard at dashboard.exa.ai
- Firecrawl: API key required from their developer portal
Each provider has its own rate limits. The server will handle rate limit errors gracefully and return appropriate error messages.
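One common pattern for handling rate-limit errors gracefully, shown here as an illustrative sketch rather than the server's actual code, is to retry 429 responses with exponential backoff:

// Retry 429 (rate-limited) responses with exponential backoff.
async function fetchWithBackoff(url: string, init: RequestInit, maxRetries = 3): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429 || attempt >= maxRetries) return res;
    // Honor Retry-After if the provider sends it; otherwise back off 1s, 2s, 4s...
    const retryAfter = Number(res.headers.get('retry-after')) || 2 ** attempt;
    await new Promise((r) => setTimeout(r, retryAfter * 1000));
  }
}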
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - see the LICENSE file for details.