NadirClaw
Open-source LLM router that saves you money. Simple prompts go to cheap/local models, complex prompts go to premium models -- automatically.
NadirClaw sits between your AI tool and your LLM providers as an OpenAI-compatible proxy. It classifies every prompt in ~10ms and routes it to the right model. Works with any tool that speaks the OpenAI API: OpenClaw, Codex, Claude Code, Continue, Cursor, or plain curl.
How does NadirClaw compare to OpenRouter? See NadirClaw vs OpenRouter.
```
pip install nadirclaw
```

Or install from source:

```
curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh
```

Then run the interactive setup wizard:

```
nadirclaw setup
```

This guides you through selecting providers, entering API keys, and choosing models for each routing tier. Then start the router:

```
nadirclaw serve --verbose
```

That's it. NadirClaw starts on http://localhost:8856 with sensible defaults (Gemini 3 Flash for simple, OpenAI Codex for complex). If you skip `nadirclaw setup`, the serve command will offer to run it on first launch.
- Smart routing — classifies prompts in ~10ms using sentence embeddings
- Agentic task detection — auto-detects tool use, multi-step loops, and agent system prompts; forces the complex model for agentic requests
- Reasoning detection — identifies prompts needing chain-of-thought and routes them to reasoning-optimized models
- Routing profiles — `auto`, `eco`, `premium`, `free`, `reasoning` — choose your cost/quality strategy per request
- Model aliases — use short names like `sonnet`, `flash`, `gpt4` instead of full model IDs
- Session persistence — pins the model for multi-turn conversations so you don't bounce between models mid-thread
- Context-window filtering — auto-swaps to a model with a larger context window when your conversation is too long
- Rate limit fallback — if the primary model is rate-limited (429), automatically falls back to the other tier's model instead of failing
- Streaming support — full SSE streaming compatible with OpenClaw, Codex, and other streaming clients
- Native Gemini support — calls Gemini models directly via the Google GenAI SDK (not through LiteLLM)
- OAuth login — use your subscription with `nadirclaw auth <provider> login` (OpenAI, Anthropic, Google); no API key needed
- Multi-provider — supports Gemini, OpenAI, Anthropic, Ollama, and any LiteLLM-supported provider
- OpenAI-compatible API — drop-in replacement for any tool that speaks the OpenAI chat completions API
- Request reporting — `nadirclaw report` analyzes your JSONL logs with filters, latency stats, tier breakdown, and token usage
- Raw logging — optional `--log-raw` flag to capture full request/response content for debugging and replay
- OpenTelemetry tracing — optional distributed tracing with GenAI semantic conventions (`pip install nadirclaw[telemetry]`)
- Python 3.10+
- git
- At least one LLM provider:
  - Google Gemini API key (free tier: 20 req/day)
  - Ollama running locally (free, no API key needed)
  - Anthropic API key for Claude models
  - OpenAI API key for GPT models
  - Provider subscriptions via OAuth (`nadirclaw auth openai login`, `nadirclaw auth anthropic login`, `nadirclaw auth antigravity login`, `nadirclaw auth gemini login`)
  - Or any provider supported by LiteLLM
```
curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh
```

This clones the repo to ~/.nadirclaw, creates a virtual environment, installs dependencies, and adds nadirclaw to your PATH. Run it again to update.

```
git clone https://github.com/doramirdor/NadirClaw.git
cd NadirClaw
python3 -m venv venv
source venv/bin/activate
pip install -e .
```

To uninstall:

```
rm -rf ~/.nadirclaw
sudo rm -f /usr/local/bin/nadirclaw
```

NadirClaw loads configuration from ~/.nadirclaw/.env. Create or edit this file to set API keys and model preferences:
```
# ~/.nadirclaw/.env

# API keys (set the ones you use)
GEMINI_API_KEY=AIza...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Model routing
NADIRCLAW_SIMPLE_MODEL=gemini-3-flash-preview
NADIRCLAW_COMPLEX_MODEL=gemini-2.5-pro

# Server
NADIRCLAW_PORT=8856
```

If ~/.nadirclaw/.env does not exist, NadirClaw falls back to .env in the current directory.
NadirClaw supports multiple ways to provide LLM credentials, checked in this order:
1. OpenClaw stored token (`~/.openclaw/agents/main/agent/auth-profiles.json`)
2. NadirClaw stored credential (`~/.nadirclaw/credentials.json`)
3. Environment variable (`GEMINI_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.)
```
# Add a Gemini API key
nadirclaw auth add --provider google --key AIza...

# Add any provider API key
nadirclaw auth add --provider anthropic --key sk-ant-...
nadirclaw auth add --provider openai --key sk-...

# Login with your OpenAI/ChatGPT subscription (OAuth, no API key needed)
nadirclaw auth openai login

# Login with your Anthropic/Claude subscription (OAuth, no API key needed)
nadirclaw auth anthropic login

# Login with Google Gemini (OAuth, opens browser)
nadirclaw auth gemini login

# Login with Google Antigravity (OAuth, opens browser)
nadirclaw auth antigravity login

# Store a Claude subscription token (from 'claude setup-token') - alternative to OAuth
nadirclaw auth setup-token

# Check what's configured
nadirclaw auth status

# Remove a credential
nadirclaw auth remove google
```

Set API keys in ~/.nadirclaw/.env:

```
GEMINI_API_KEY=AIza...        # or GOOGLE_API_KEY
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
```

Configure which model handles each tier:
```
NADIRCLAW_SIMPLE_MODEL=gemini-3-flash-preview  # cheap/free model
NADIRCLAW_COMPLEX_MODEL=gemini-2.5-pro         # premium model
NADIRCLAW_REASONING_MODEL=o3                   # reasoning tasks (optional, defaults to complex)
NADIRCLAW_FREE_MODEL=ollama/llama3.1:8b        # free fallback (optional, defaults to simple)
```

| Setup | Simple Model | Complex Model | API Keys Needed |
|---|---|---|---|
| Gemini + Gemini | `gemini-2.5-flash` | `gemini-2.5-pro` | `GEMINI_API_KEY` |
| Gemini + Claude | `gemini-2.5-flash` | `claude-sonnet-4-5-20250929` | `GEMINI_API_KEY` + `ANTHROPIC_API_KEY` |
| Claude + Ollama | `ollama/llama3.1:8b` | `claude-sonnet-4-5-20250929` | `ANTHROPIC_API_KEY` |
| Claude + Claude | `claude-haiku-4-5-20251001` | `claude-sonnet-4-5-20250929` | `ANTHROPIC_API_KEY` |
| OpenAI + Ollama | `ollama/llama3.1:8b` | `gpt-4.1` | `OPENAI_API_KEY` |
| OpenAI + OpenAI | `gpt-4.1-mini` | `gpt-4.1` | `OPENAI_API_KEY` |
| OpenAI Codex | `gemini-2.5-flash` | `openai-codex/gpt-5.3-codex` | `GEMINI_API_KEY` + OAuth login |
| Fully local | `ollama/llama3.1:8b` | `ollama/qwen3:32b` | None |
Gemini models are called natively via the Google GenAI SDK. All other models go through LiteLLM, which supports 100+ providers.
Gemini is the default simple model. NadirClaw calls Gemini natively via the Google GenAI SDK for best performance.
```
# Set your Gemini API key
nadirclaw auth add --provider google --key AIza...

# Or set in ~/.nadirclaw/.env
echo "GEMINI_API_KEY=AIza..." >> ~/.nadirclaw/.env

# Start the router
nadirclaw serve --verbose
```

If the primary model hits a 429 rate limit, NadirClaw automatically retries once, then falls back to the other tier's model. For example, if gemini-3-flash-preview is exhausted, NadirClaw will try gemini-2.5-pro (or whatever your complex model is). If both models are rate-limited, it returns a friendly error message instead of crashing.
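The retry-then-fallback behavior can be sketched in a few lines. This is a minimal illustration, not NadirClaw's actual implementation; `call_model` and `RateLimitError` are hypothetical stand-ins for a provider call and a 429 response:

```python
class RateLimitError(Exception):
    """Stand-in for a provider 429 response."""

def dispatch_with_fallback(prompt, primary, fallback, call_model):
    """Try the primary model, retry once on 429, then fall back to the other tier."""
    for model in (primary, primary, fallback):  # primary listed twice = one retry
        try:
            return model, call_model(model, prompt)
        except RateLimitError:
            continue
    # Both tiers exhausted: return an error message instead of crashing
    return None, "All models are rate-limited right now; please retry shortly."

# Simulate a simple-tier model that is rate-limited
def fake_call(model, prompt):
    if model == "gemini-3-flash-preview":
        raise RateLimitError()
    return f"{model} says: 4"

used, answer = dispatch_with_fallback(
    "What is 2+2?", "gemini-3-flash-preview", "gemini-2.5-pro", fake_call
)
print(used)  # gemini-2.5-pro
```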
If you're running Ollama locally, NadirClaw works out of the box with no API keys:
```
# Fully local setup -- no API keys, no cost
NADIRCLAW_SIMPLE_MODEL=ollama/llama3.1:8b \
NADIRCLAW_COMPLEX_MODEL=ollama/qwen3:32b \
nadirclaw serve --verbose
```

Or mix local + cloud:

```
nadirclaw serve \
  --simple-model ollama/llama3.1:8b \
  --complex-model claude-sonnet-4-20250514 \
  --verbose
```

| Model | Size | Good For |
|---|---|---|
| `llama3.1:8b` | 4.7 GB | Simple tier (fast, good enough) |
| `qwen3:32b` | 19 GB | Complex tier (local, no API cost) |
| `qwen3-coder` | 19 GB | Code-heavy complex tier |
| `deepseek-r1:14b` | 9 GB | Reasoning-heavy complex tier |
OpenClaw is a personal AI assistant that bridges messaging services to AI coding agents. NadirClaw integrates as a model provider so OpenClaw's requests are automatically routed to the right model.
```
# Auto-configure OpenClaw to use NadirClaw
nadirclaw openclaw onboard

# Start the router
nadirclaw serve
```

This writes NadirClaw as a provider in ~/.openclaw/openclaw.json with model nadirclaw/auto. If OpenClaw is already running, it will auto-reload the config -- no restart needed.

```
nadirclaw openclaw onboard

# Then start NadirClaw separately when ready:
nadirclaw serve
```

`nadirclaw openclaw onboard` adds this to your OpenClaw config:
```
{
  "models": {
    "providers": {
      "nadirclaw": {
        "baseUrl": "http://localhost:8856/v1",
        "apiKey": "local",
        "api": "openai-completions",
        "models": [{ "id": "auto", "name": "auto" }]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": { "primary": "nadirclaw/auto" }
    }
  }
}
```

NadirClaw supports the SSE streaming format that OpenClaw expects (stream: true), handling multi-modal content and tool definitions in system prompts.
Codex is OpenAI's CLI coding agent. NadirClaw integrates as a custom model provider.
```
# Auto-configure Codex
nadirclaw codex onboard

# Start the router
nadirclaw serve
```

This writes ~/.codex/config.toml:

```
model_provider = "nadirclaw"

[model_providers.nadirclaw]
base_url = "http://localhost:8856/v1"
api_key = "local"
```

To use your ChatGPT subscription instead of an API key:
```
# Login with your OpenAI account (opens browser)
nadirclaw auth openai login

# NadirClaw will auto-refresh the token when it expires
```

This delegates to the Codex CLI for the OAuth flow and stores the credentials in ~/.nadirclaw/credentials.json. Tokens are automatically refreshed when they expire.
NadirClaw exposes a standard OpenAI-compatible API. Point any tool at it:
```
# Base URL
http://localhost:8856/v1

# Model
model: "auto"   # or omit -- NadirClaw picks the best model
```

```
curl http://localhost:8856/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is 2+2?"}]
  }'
```

```
curl http://localhost:8856/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is 2+2?"}],
    "stream": true
  }'
```

```
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8856/v1",
    api_key="local",  # NadirClaw doesn't require auth by default
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(response.choices[0].message.content)
```

Choose your routing strategy by setting the model field:
| Profile | Model Field | Strategy | Use Case |
|---|---|---|---|
| auto | `auto` or omit | Smart routing (default) | Best overall balance |
| eco | `eco` | Always use simple model | Maximum savings |
| premium | `premium` | Always use complex model | Best quality |
| free | `free` | Use free fallback model | Zero cost |
| reasoning | `reasoning` | Use reasoning model | Chain-of-thought tasks |
```
# Use profiles via the model field
curl http://localhost:8856/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "eco", "messages": [{"role": "user", "content": "Hello"}]}'

# Also works with the nadirclaw/ prefix
# model: "nadirclaw/eco", "nadirclaw/premium", etc.
```

Use short names instead of full model IDs:
| Alias | Resolves To |
|---|---|
| `sonnet` | `claude-sonnet-4-5-20250929` |
| `opus` | `claude-opus-4-6-20250918` |
| `haiku` | `claude-haiku-4-5-20251001` |
| `gpt4` | `gpt-4.1` |
| `gpt5` | `gpt-5.2` |
| `flash` | `gemini-2.5-flash` |
| `gemini-pro` | `gemini-2.5-pro` |
| `deepseek` | `deepseek/deepseek-chat` |
| `deepseek-r1` | `deepseek/deepseek-reasoner` |
| `llama` | `ollama/llama3.1:8b` |
```
# Use an alias as the model
curl http://localhost:8856/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "sonnet", "messages": [{"role": "user", "content": "Hello"}]}'
```

Beyond basic simple/complex classification, NadirClaw applies routing modifiers that can override the base decision:
NadirClaw detects agentic requests (coding agents, multi-step tool use) and forces them to the complex model, even if the individual message looks simple. Signals:
- Tool definitions in the request (`tools` array)
- Tool-role messages (active tool execution loop)
- Assistant→tool→assistant cycles (multi-step execution)
- Agent-like system prompts ("you are a coding agent", "you can execute commands")
- Long system prompts (>500 chars, typical of agent instructions)
- Deep conversations (>10 messages)
This prevents a message like "now add tests" from being routed to the cheap model when it's part of an ongoing agentic refactoring session.
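These signals can be approximated in a short function. This is a simplified sketch of the heuristics listed above, not the actual routing.py logic; the thresholds mirror the list:

```python
def looks_agentic(request: dict) -> bool:
    """Heuristic agentic-request detection mirroring the signals above."""
    messages = request.get("messages", [])
    if request.get("tools"):                            # tool definitions present
        return True
    if any(m.get("role") == "tool" for m in messages):  # active tool-execution loop
        return True
    system = " ".join(
        m.get("content", "") for m in messages if m.get("role") == "system"
    )
    agent_phrases = ("you are a coding agent", "you can execute commands")
    if any(p in system.lower() for p in agent_phrases): # agent-like system prompt
        return True
    if len(system) > 500:                               # long agent-style instructions
        return True
    return len(messages) > 10                           # deep conversation

req = {
    "messages": [{"role": "user", "content": "now add tests"}],
    "tools": [{"type": "function", "name": "run_shell"}],
}
print(looks_agentic(req))  # True: "now add tests" still goes to the complex model
```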
Prompts with 2+ reasoning markers are routed to the reasoning model (or complex model if no reasoning model is configured):
- "step by step", "think through", "chain of thought"
- "prove that", "derive the", "mathematically show"
- "analyze the tradeoffs", "compare and contrast"
- "critically analyze", "evaluate whether"
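The marker check is a simple count over the prompt text. A minimal sketch, assuming case-insensitive substring matching (the real marker list and matching rules may differ):

```python
REASONING_MARKERS = (
    "step by step", "think through", "chain of thought",
    "prove that", "derive the", "mathematically show",
    "analyze the tradeoffs", "compare and contrast",
    "critically analyze", "evaluate whether",
)

def needs_reasoning(prompt: str, threshold: int = 2) -> bool:
    """Route to the reasoning model when 2+ markers appear in the prompt."""
    hits = sum(marker in prompt.lower() for marker in REASONING_MARKERS)
    return hits >= threshold

print(needs_reasoning("Prove that this bound holds, step by step"))  # True (2 markers)
```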
Once a conversation is routed to a model, subsequent messages in the same session reuse that model. This prevents jarring mid-conversation model switches. Sessions are keyed by system prompt + first user message, with a 30-minute TTL.
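This keying scheme can be sketched with an in-memory store (an illustration only; the real session logic lives in routing.py and may differ in detail):

```python
import hashlib
import time

SESSION_TTL = 30 * 60  # 30-minute TTL per conversation
_sessions = {}         # key -> (model, pinned_at)

def session_key(system_prompt, first_user_msg):
    """Sessions are keyed by system prompt + first user message."""
    raw = system_prompt + "\x00" + first_user_msg
    return hashlib.sha256(raw.encode()).hexdigest()

def pin_model(key, model):
    _sessions[key] = (model, time.time())

def pinned_model(key):
    """Return the pinned model if the session is still within its TTL."""
    entry = _sessions.get(key)
    if entry and time.time() - entry[1] < SESSION_TTL:
        return entry[0]
    return None

k = session_key("You are helpful.", "Refactor this module")
pin_model(k, "gemini-2.5-pro")
print(pinned_model(k))  # gemini-2.5-pro
```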
If the estimated token count of a request exceeds a model's context window, NadirClaw automatically swaps to a model with a larger context. For example, a 150k-token conversation targeting gpt-4o (128k context) will be redirected to gemini-2.5-pro (1M context).
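The swap can be illustrated with a rough chars/4 token estimate over a hypothetical context-size table (both the estimate and the table are illustrative assumptions, not NadirClaw's actual values):

```python
# Hypothetical context sizes, for illustration only
CONTEXT_WINDOWS = {"gpt-4o": 128_000, "gemini-2.5-pro": 1_000_000}

def estimate_tokens(messages):
    """Rough estimate: ~4 characters per token."""
    return sum(len(m["content"]) for m in messages) // 4

def fit_model(messages, chosen):
    """Swap to a larger-context model when the conversation won't fit."""
    needed = estimate_tokens(messages)
    if needed <= CONTEXT_WINDOWS[chosen]:
        return chosen
    # Pick the smallest model whose window still fits the request
    candidates = [m for m, w in CONTEXT_WINDOWS.items() if w >= needed]
    return min(candidates, key=CONTEXT_WINDOWS.get) if candidates else chosen

long_chat = [{"role": "user", "content": "x" * 600_000}]  # ~150k tokens
print(fit_model(long_chat, "gpt-4o"))  # gemini-2.5-pro
```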
```
nadirclaw setup                    # Interactive setup wizard (providers, keys, models)
nadirclaw serve                    # Start the router server
nadirclaw serve --log-raw          # Start with full request/response logging
nadirclaw classify                 # Classify a prompt (no server needed)
nadirclaw report                   # Show a summary report of request logs
nadirclaw report --since 24h       # Report for the last 24 hours
nadirclaw status                   # Show config, credentials, and server status
nadirclaw auth add                 # Add an API key for any provider
nadirclaw auth status              # Show configured credentials (masked)
nadirclaw auth remove              # Remove a stored credential
nadirclaw auth setup-token         # Store a Claude subscription token (alternative to OAuth)
nadirclaw auth openai login        # Login with OpenAI subscription (OAuth)
nadirclaw auth openai logout       # Remove stored OpenAI OAuth credential
nadirclaw auth anthropic login     # Login with Anthropic/Claude subscription (OAuth)
nadirclaw auth anthropic logout    # Remove stored Anthropic OAuth credential
nadirclaw auth antigravity login   # Login with Google Antigravity (OAuth, opens browser)
nadirclaw auth antigravity logout  # Remove stored Antigravity OAuth credential
nadirclaw auth gemini login        # Login with Google Gemini (OAuth, opens browser)
nadirclaw auth gemini logout       # Remove stored Gemini OAuth credential
nadirclaw codex onboard            # Configure Codex integration
nadirclaw openclaw onboard         # Configure OpenClaw integration
nadirclaw build-centroids          # Regenerate centroid vectors from prototypes
```

```
nadirclaw serve [OPTIONS]

Options:
  --port INTEGER        Port to listen on (default: 8856)
  --simple-model TEXT   Model for simple prompts
  --complex-model TEXT  Model for complex prompts
  --models TEXT         Comma-separated model list (legacy)
  --token TEXT          Auth token
  --verbose             Enable debug logging
  --log-raw             Log full raw requests and responses to JSONL
```

Analyze request logs and print a summary report:
```
nadirclaw report                     # full report
nadirclaw report --since 24h         # last 24 hours
nadirclaw report --since 7d          # last 7 days
nadirclaw report --since 2025-02-01  # since a specific date
nadirclaw report --model gemini      # filter by model name
nadirclaw report --format json       # machine-readable JSON output
nadirclaw report --export report.txt # save to file
```

Example output:
```
NadirClaw Report
==================================================
Total requests: 147
From: 2026-02-14T08:12:03+00:00
To:   2026-02-14T22:47:19+00:00

Requests by Type
------------------------------
classify       12
completion    135

Tier Distribution
------------------------------
complex        41 (31.1%)
direct          8 (6.1%)
simple         83 (62.9%)

Model Usage
------------------------------------------------------------
Model                         Reqs    Tokens
gemini-3-flash-preview          83     48210
openai-codex/gpt-5.3-codex      41    127840
claude-sonnet-4-20250514         8     31500

Latency (ms)
----------------------------------------
classifier  avg=12   p50=11   p95=24
total       avg=847  p50=620  p95=2340

Token Usage
------------------------------
Prompt:     138420
Completion:  69130
Total:      207550

Fallbacks: 3
Errors: 2
Streaming requests: 47
Requests with tools: 18 (54 tools total)
```
Classify a prompt locally without running the server. Useful for testing your setup:

```
$ nadirclaw classify "What is 2+2?"
Tier: simple
Confidence: 0.2848
Score: 0.0000
Model: gemini-3-flash-preview

$ nadirclaw classify "Design a distributed system for real-time trading"
Tier: complex
Confidence: 0.1843
Score: 1.0000
Model: gemini-2.5-pro
```

```
$ nadirclaw status
NadirClaw Status
----------------------------------------
Simple model:  gemini-3-flash-preview
Complex model: gemini-2.5-pro
Tier config:   explicit (env vars)
Port:          8856
Threshold:     0.06
Log dir:       /Users/you/.nadirclaw/logs
Token:         nadir-***
Server:        RUNNING (ok)
```

Most LLM usage doesn't need a premium model. NadirClaw routes each prompt to the right tier automatically:
NadirClaw uses a binary complexity classifier based on sentence embeddings:
1. Pre-computed centroids: Ships two tiny centroid vectors (~1.5 KB each) derived from ~170 seed prompts. These are pre-computed and included in the package — no training step required.
2. Classification: For each incoming prompt, computes its embedding using all-MiniLM-L6-v2 (~80 MB, downloaded once on first use) and measures cosine similarity to both centroids. If the prompt is closer to the complex centroid, it routes to your complex model; otherwise to your simple model.
3. Borderline handling: When confidence is below the threshold (default 0.06), the classifier defaults to complex -- it's cheaper to over-serve a simple prompt than to under-serve a complex one.
4. Routing modifiers: After classification, NadirClaw applies intelligent overrides:
   - Agentic detection — if tool definitions, tool-role messages, or agent system prompts are detected, forces the complex model
   - Reasoning detection — if 2+ reasoning markers are found, routes to the reasoning model
   - Context window check — if the conversation exceeds the model's context window, swaps to a model that fits
   - Session persistence — reuses the same model for follow-up messages in the same conversation
5. Dispatch: Calls the selected model via the appropriate backend:
   - Gemini models — called natively via the Google GenAI SDK for best performance
   - All other models — called via LiteLLM, which provides a unified interface to 100+ providers
6. Rate limit fallback: If the selected model returns a 429 rate limit error, NadirClaw retries once, then automatically falls back to the other tier's model. If both are rate-limited, it returns a user-friendly error message.
Classification takes ~10ms on a warm encoder. The first request takes ~2-3 seconds to load the embedding model.
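The classify-then-threshold step can be illustrated with plain cosine similarity. The toy 3-d vectors below stand in for the real 384-dimensional all-MiniLM-L6-v2 embeddings and shipped centroid files, and the confidence definition here is an assumption:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def classify(embedding, simple_centroid, complex_centroid, threshold=0.06):
    """Compare a prompt embedding to both centroids; default to complex when borderline."""
    s = cosine(embedding, simple_centroid)
    c = cosine(embedding, complex_centroid)
    if abs(s - c) < threshold:
        return "complex"  # borderline: over-serving is cheaper than under-serving
    return "simple" if s > c else "complex"

simple_c, complex_c = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
print(classify([0.9, 0.1, 0.0], simple_c, complex_c))  # simple
print(classify([0.5, 0.5, 0.0], simple_c, complex_c))  # complex (borderline)
```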
Auth is disabled by default (local-only). Set NADIRCLAW_AUTH_TOKEN to require a bearer token.
| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | OpenAI-compatible completions with auto routing (supports `stream: true`) |
| `/v1/classify` | POST | Classify a prompt without calling an LLM |
| `/v1/classify/batch` | POST | Classify multiple prompts at once |
| `/v1/models` | GET | List available models |
| `/v1/logs` | GET | View recent request logs |
| `/health` | GET | Health check (no auth required) |
| Variable | Default | Description |
|---|---|---|
| `NADIRCLAW_SIMPLE_MODEL` | `gemini-3-flash-preview` | Model for simple prompts |
| `NADIRCLAW_COMPLEX_MODEL` | `openai-codex/gpt-5.3-codex` | Model for complex prompts |
| `NADIRCLAW_REASONING_MODEL` | (falls back to complex) | Model for reasoning tasks |
| `NADIRCLAW_FREE_MODEL` | (falls back to simple) | Free fallback model |
| `NADIRCLAW_AUTH_TOKEN` | (empty — auth disabled) | Set to require a bearer token |
| `GEMINI_API_KEY` | -- | Google Gemini API key (also accepts `GOOGLE_API_KEY`) |
| `ANTHROPIC_API_KEY` | -- | Anthropic API key |
| `OPENAI_API_KEY` | -- | OpenAI API key |
| `OLLAMA_API_BASE` | `http://localhost:11434` | Ollama base URL |
| `NADIRCLAW_CONFIDENCE_THRESHOLD` | `0.06` | Classification threshold (lower = more complex) |
| `NADIRCLAW_PORT` | `8856` | Server port |
| `NADIRCLAW_LOG_DIR` | `~/.nadirclaw/logs` | Log directory |
| `NADIRCLAW_LOG_RAW` | `false` | Log full raw requests and responses (true/false) |
| `NADIRCLAW_MODELS` | `openai-codex/gpt-5.3-codex,gemini-3-flash-preview` | Legacy model list (fallback if tier vars not set) |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | (empty — disabled) | OpenTelemetry collector endpoint (enables tracing) |
NadirClaw supports optional distributed tracing via OpenTelemetry. Install the extras and set an OTLP endpoint:
```
pip install nadirclaw[telemetry]

# Export to a local collector (e.g. Jaeger, Grafana Tempo)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 nadirclaw serve
```

When enabled, NadirClaw emits spans for:
- `smart_route_analysis` — classifier decision with tier and selected model
- `dispatch_model` — individual LLM provider call
- `chat_completion` — full request lifecycle
Spans include GenAI semantic conventions (gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens) plus custom nadirclaw.* attributes for routing metadata.
If the telemetry packages are not installed or OTEL_EXPORTER_OTLP_ENDPOINT is not set, all tracing is a no-op with zero overhead.
```
nadirclaw/
  __init__.py           # Package version
  cli.py                # CLI commands (setup, serve, classify, report, status, auth, codex, openclaw)
  setup.py              # Interactive setup wizard (provider selection, credentials, model config)
  server.py             # FastAPI server with OpenAI-compatible API + streaming
  classifier.py         # Binary complexity classifier (sentence embeddings)
  credentials.py        # Credential storage, resolution chain, and OAuth token refresh
  encoder.py            # Shared SentenceTransformer singleton
  oauth.py              # OAuth login flows (OpenAI, Anthropic, Gemini, Antigravity)
  routing.py            # Routing intelligence (agentic, reasoning, profiles, aliases, sessions)
  report.py             # Log parsing and report generation
  telemetry.py          # Optional OpenTelemetry integration (no-op without packages)
  auth.py               # Bearer token / API key authentication
  settings.py           # Environment-based configuration (reads ~/.nadirclaw/.env)
  prototypes.py         # Seed prompts for centroid generation
  simple_centroid.npy   # Pre-computed simple centroid vector
  complex_centroid.npy  # Pre-computed complex centroid vector
```
MIT
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for NadirClaw
Similar Open Source Tools
NadirClaw
NadirClaw is a powerful open-source tool designed for web scraping and data extraction. It provides a user-friendly interface for extracting data from websites with ease. With NadirClaw, users can easily scrape text, images, and other content from web pages for various purposes such as data analysis, research, and automation. The tool offers flexibility and customization options to cater to different scraping needs, making it a versatile solution for extracting data from the web. Whether you are a data scientist, researcher, or developer, NadirClaw can streamline your data extraction process and help you gather valuable insights from online sources.
Aimer_WT
Aimer_WT is a web scraping tool designed to extract data from websites efficiently and accurately. It provides a user-friendly interface for users to specify the data they want to scrape and offers various customization options. With Aimer_WT, users can easily automate the process of collecting data from multiple web pages, saving time and effort. The tool is suitable for both beginners and experienced users who need to gather data for research, analysis, or other purposes. Aimer_WT supports various data formats and allows users to export the extracted data for further processing.
waidrin
Waidrin is a powerful web scraping tool that allows users to easily extract data from websites. It provides a user-friendly interface for creating custom web scraping scripts and supports various data formats for exporting the extracted data. With Waidrin, users can automate the process of collecting information from multiple websites, saving time and effort. The tool is designed to be flexible and scalable, making it suitable for both beginners and advanced users in the field of web scraping.
onlook
Onlook is a web scraping tool that allows users to extract data from websites easily and efficiently. It provides a user-friendly interface for creating web scraping scripts and supports various data formats for exporting the extracted data. With Onlook, users can automate the process of collecting information from multiple websites, saving time and effort. The tool is designed to be flexible and customizable, making it suitable for a wide range of web scraping tasks.
HyperAgent
HyperAgent is a powerful tool for automating repetitive tasks in web scraping and data extraction. It provides a user-friendly interface to create custom web scraping scripts without the need for extensive coding knowledge. With HyperAgent, users can easily extract data from websites, transform it into structured formats, and save it for further analysis. The tool supports various data formats and offers scheduling options for automated data extraction at regular intervals. HyperAgent is suitable for individuals and businesses looking to streamline their data collection processes and improve efficiency in extracting information from the web.
Website-Crawler
Website-Crawler is a tool designed to extract data from websites in an automated manner. It allows users to scrape information such as text, images, links, and more from web pages. The tool provides functionalities to navigate through websites, handle different types of content, and store extracted data for further analysis. Website-Crawler is useful for tasks like web scraping, data collection, content aggregation, and competitive analysis. It can be customized to extract specific data elements based on user requirements, making it a versatile tool for various web data extraction needs.
firecrawl
Firecrawl is an API service that empowers AI applications with clean data from any website. It features advanced scraping, crawling, and data extraction capabilities. The repository is still in development, integrating custom modules into the mono repo. Users can run it locally but it's not fully ready for self-hosted deployment yet. Firecrawl offers powerful capabilities like scraping, crawling, mapping, searching, and extracting structured data from single pages, multiple pages, or entire websites with AI. It supports various formats, actions, and batch scraping. The tool is designed to handle proxies, anti-bot mechanisms, dynamic content, media parsing, change tracking, and more. Firecrawl is available as an open-source project under the AGPL-3.0 license, with additional features offered in the cloud version.
context7
Context7 is a powerful tool for analyzing and visualizing data in various formats. It provides a user-friendly interface for exploring datasets, generating insights, and creating interactive visualizations. With advanced features such as data filtering, aggregation, and customization, Context7 is suitable for both beginners and experienced data analysts. The tool supports a wide range of data sources and formats, making it versatile for different use cases. Whether you are working on exploratory data analysis, data visualization, or data storytelling, Context7 can help you uncover valuable insights and communicate your findings effectively.
ROGRAG
ROGRAG is a powerful open-source tool designed for data analysis and visualization. It provides a user-friendly interface for exploring and manipulating datasets, making it ideal for researchers, data scientists, and analysts. With ROGRAG, users can easily import, clean, analyze, and visualize data to gain valuable insights and make informed decisions. The tool supports a wide range of data formats and offers a variety of statistical and visualization tools to help users uncover patterns, trends, and relationships in their data. Whether you are working on exploratory data analysis, statistical modeling, or data visualization, ROGRAG is a versatile tool that can streamline your workflow and enhance your data analysis capabilities.
arconia
Arconia is a powerful open-source tool for managing and visualizing data in a user-friendly way. It provides a seamless experience for data analysts and scientists to explore, clean, and analyze datasets efficiently. With its intuitive interface and robust features, Arconia simplifies the process of data manipulation and visualization, making it an essential tool for anyone working with data.
CrossIntelligence
CrossIntelligence is a powerful tool for data analysis and visualization. It allows users to easily connect and analyze data from multiple sources, providing valuable insights and trends. With a user-friendly interface and customizable features, CrossIntelligence is suitable for both beginners and advanced users in various industries such as marketing, finance, and research.
atlas
Atlas is a powerful data visualization tool that allows users to create interactive charts and graphs from their datasets. It provides a user-friendly interface for exploring and analyzing data, making it ideal for both beginners and experienced data analysts. With Atlas, users can easily customize the appearance of their visualizations, add filters and drill-down capabilities, and share their insights with others. The tool supports a wide range of data formats and offers various chart types to suit different data visualization needs. Whether you are looking to create simple bar charts or complex interactive dashboards, Atlas has you covered.
vizra-adk
Vizra-ADK is a data visualization tool that allows users to create interactive and customizable visualizations for their data. With a user-friendly interface and a wide range of customization options, Vizra-ADK makes it easy for users to explore and analyze their data in a visually appealing way. Whether you're a data scientist looking to create informative charts and graphs, or a business analyst wanting to present your findings in a compelling way, Vizra-ADK has you covered. The tool supports various data formats and provides features like filtering, sorting, and grouping to help users make sense of their data quickly and efficiently.
crawl4ai
Crawl4AI is a powerful and free web crawling service that extracts valuable data from websites and provides LLM-friendly output formats. It supports crawling multiple URLs simultaneously, replaces media tags with ALT, and is completely free to use and open-source. Users can integrate Crawl4AI into Python projects as a library or run it as a standalone local server. The tool allows users to crawl and extract data from specified URLs using different providers and models, with options to include raw HTML content, force fresh crawls, and extract meaningful text blocks. Configuration settings can be adjusted in the `crawler/config.py` file to customize providers, API keys, chunk processing, and word thresholds. Contributions to Crawl4AI are welcome from the open-source community to enhance its value for AI enthusiasts and developers.
ag2
Ag2 is a lightweight and efficient tool for generating automated reports from data sources. It simplifies the process of creating reports by allowing users to define templates and automate the data extraction and formatting. With Ag2, users can easily generate reports in various formats such as PDF, Excel, and CSV, saving time and effort in manual report generation tasks.
llama_index
LlamaIndex is a data framework for building LLM applications. It provides tools for ingesting, structuring, and querying data, as well as integrating with LLMs and other tools. LlamaIndex is designed to be easy to use for both beginner and advanced users, and it provides a comprehensive set of features for building LLM applications.
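The ingest/structure/query loop described above can be illustrated with a toy in-memory keyword index (pure Python; this shows the general pattern such data frameworks automate, not LlamaIndex's actual API, whose real entry points require an LLM or embedding backend):

```python
# Toy illustration of the ingest -> structure -> query pattern that data
# frameworks like LlamaIndex are built around. Pure Python; this is NOT
# the LlamaIndex API, just the shape of the workflow.

class KeywordIndex:
    def __init__(self):
        self.docs = []        # ingested documents
        self.inverted = {}    # word -> set of doc ids

    def ingest(self, text):
        doc_id = len(self.docs)
        self.docs.append(text)
        for word in text.lower().split():
            self.inverted.setdefault(word, set()).add(doc_id)
        return doc_id

    def query(self, question):
        # Score documents by how many query words they contain.
        scores = {}
        for word in question.lower().split():
            for doc_id in self.inverted.get(word, ()):
                scores[doc_id] = scores.get(doc_id, 0) + 1
        best = max(scores, key=scores.get) if scores else None
        return self.docs[best] if best is not None else None

index = KeywordIndex()
index.ingest("LlamaIndex structures data for LLM applications")
index.ingest("PyTorch Forecasting trains timeseries models")
print(index.query("structure data for an LLM"))
```

Real frameworks replace the keyword match with embeddings and the stored strings with chunked, metadata-tagged nodes, but the ingest-then-query shape is the same.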
For similar tasks
Awesome-Segment-Anything
Awesome-Segment-Anything is a curated collection of papers, projects, and resources around the Segment Anything Model (SAM) and its extensions. It tracks follow-up work and downstream applications of promptable segmentation across domains, making it a useful reference hub for researchers and practitioners building on SAM.
Time-LLM
Time-LLM is a reprogramming framework that repurposes large language models (LLMs) for time series forecasting. It allows users to treat time series analysis as a 'language task' and effectively leverage pre-trained LLMs for forecasting. The framework involves reprogramming time series data into text representations and providing declarative prompts to guide the LLM reasoning process. Time-LLM supports various backbone models such as Llama-7B, GPT-2, and BERT, offering flexibility in model selection. The tool provides a general framework for repurposing language models for time series forecasting tasks.
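The core "time series as a language task" idea can be sketched by serializing a numeric window into a prompt string (a deliberately simplified illustration; Time-LLM's actual method reprograms patch embeddings rather than emitting plain text):

```python
# Simplified sketch of treating forecasting as a language task: render a
# numeric window as a declarative prompt for an LLM. Time-LLM's real
# mechanism reprograms patch embeddings; this only shows the prompting idea.

def series_to_prompt(values, horizon=3):
    """Render a time series window as a forecasting prompt."""
    rendered = ", ".join(f"{v:.2f}" for v in values)
    return (
        f"The last {len(values)} observations were: {rendered}. "
        f"Predict the next {horizon} values."
    )

prompt = series_to_prompt([1.0, 1.5, 2.25, 3.1], horizon=2)
print(prompt)
```

A declarative prompt like this, combined with a frozen backbone such as Llama-7B or GPT-2, is what lets the framework reuse pre-trained language models without retraining them on numeric data.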
crewAI
CrewAI is a cutting-edge framework designed to orchestrate role-playing autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks. It enables AI agents to assume roles, share goals, and operate in a cohesive unit, much like a well-oiled crew. Whether you're building a smart assistant platform, an automated customer service ensemble, or a multi-agent research team, CrewAI provides the backbone for sophisticated multi-agent interactions. With features like role-based agent design, autonomous inter-agent delegation, flexible task management, and support for various LLMs, CrewAI offers a dynamic and adaptable solution for both development and production workflows.
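The role-based, sequential orchestration described above can be sketched with plain-Python stand-ins (real CrewAI agents wrap an LLM and support autonomous delegation; the classes below only show the shape of the pattern and are not CrewAI's API):

```python
# Toy sketch of role-based agent orchestration in the style CrewAI
# describes: agents with roles, run as a sequential crew. Pure-Python
# stand-ins; real CrewAI agents call an LLM, these just run a callable.

class Agent:
    def __init__(self, role, work):
        self.role = role
        self.work = work          # callable: input -> output

class Crew:
    def __init__(self, agents):
        self.agents = agents

    def kickoff(self, task):
        # Sequential process: each agent's output feeds the next agent.
        result = task
        for agent in self.agents:
            result = agent.work(result)
        return result

researcher = Agent("researcher", lambda q: f"notes on: {q}")
writer = Agent("writer", lambda notes: f"report based on {notes}")
crew = Crew([researcher, writer])
print(crew.kickoff("LLM routing"))
```

The pipeline shape is the point: each role owns one step, and the framework's job is passing context between them (plus retries, delegation, and tool calls in the real thing).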
Transformers_And_LLM_Are_What_You_Dont_Need
Transformers_And_LLM_Are_What_You_Dont_Need is a repository that explores the limitations of transformers in time series forecasting. It contains a collection of papers, articles, and theses discussing the effectiveness of transformers and LLMs in this domain. The repository aims to provide insights into why transformers may not be the best choice for time series forecasting tasks.
pytorch-forecasting
PyTorch Forecasting is a PyTorch-based package for time series forecasting with state-of-the-art network architectures. It offers a high-level API for training networks on pandas data frames and utilizes PyTorch Lightning for scalable training on GPUs and CPUs. The package aims to simplify time series forecasting with neural networks by providing a flexible API for professionals and default settings for beginners. It includes a timeseries dataset class, base model class, multiple neural network architectures, multi-horizon timeseries metrics, and hyperparameter tuning with optuna. PyTorch Forecasting is built on pytorch-lightning for easy training on various hardware configurations.
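The core job of a timeseries dataset class, slicing a series into encoder and prediction windows, can be sketched in plain Python (PyTorch Forecasting's actual TimeSeriesDataSet additionally handles scaling, categoricals, group ids, and missing values):

```python
# Conceptual sketch of what a timeseries dataset class does at its core:
# slice a series into (encoder window, prediction horizon) training pairs
# for multi-horizon forecasting. PyTorch Forecasting's TimeSeriesDataSet
# layers normalization, covariates, and missing-value handling on top.

def sliding_windows(series, encoder_length, horizon):
    """Return (past, future) pairs for multi-horizon forecasting."""
    pairs = []
    for start in range(len(series) - encoder_length - horizon + 1):
        past = series[start : start + encoder_length]
        future = series[start + encoder_length : start + encoder_length + horizon]
        pairs.append((past, future))
    return pairs

data = [1, 2, 3, 4, 5, 6]
print(sliding_windows(data, encoder_length=3, horizon=2))
```

Every (past, future) pair becomes one training example, which is why the same dataset abstraction feeds all of the package's network architectures.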
spider
Spider is a high-performance web crawler and indexer designed to handle data curation workloads efficiently. It offers features such as concurrency, streaming, decentralization, headless Chrome rendering, HTTP proxies, cron jobs, subscriptions, smart mode, blacklisting, whitelisting, budgeting depth, dynamic AI prompt scripting, CSS scraping, and more. Users can easily get started with the Spider Cloud hosted service or set up local installations with spider-cli. The tool supports integration with Node.js and Python for additional flexibility. With a focus on speed and scalability, Spider is ideal for extracting and organizing data from the web.
AI_for_Science_paper_collection
AI for Science paper collection is an initiative by AI for Science Community to collect and categorize papers in AI for Science areas by subjects, years, venues, and keywords. The repository contains `.csv` files with paper lists labeled by keys such as `Title`, `Conference`, `Type`, `Application`, `MLTech`, `OpenReviewLink`. It covers top conferences like ICML, NeurIPS, and ICLR. Volunteers can contribute by updating existing `.csv` files or adding new ones for uncovered conferences/years. The initiative aims to track the increasing trend of AI for Science papers and analyze trends in different applications.
For similar jobs
databerry
Chaindesk (formerly Databerry) is a no-code platform that lets users set up a semantic search system over personal data without technical knowledge. It supports loading data from various sources such as raw text, web pages, and files (Word, Excel, PowerPoint, PDF, Markdown, plain text), with support for entire websites, Notion, and Airtable on the roadmap. The platform offers a user-friendly interface for managing datastores, querying data via a secure API endpoint, and auto-generating ChatGPT Plugins for each datastore. Chaindesk uses a vector database (Qdrant) and OpenAI's text-embedding-ada-002 for embeddings, with a chunk size of 1024 tokens. The technology stack includes Next.js, Joy UI, LangchainJS, PostgreSQL, Prisma, and Qdrant, inspired by the ChatGPT Retrieval Plugin.
OAD
OAD is a powerful open-source tool for analyzing and visualizing data. It provides a user-friendly interface for exploring datasets, generating insights, and creating interactive visualizations. With OAD, users can easily import data from various sources, clean and preprocess data, perform statistical analysis, and create customizable visualizations to communicate findings effectively. Whether you are a data scientist, analyst, or researcher, OAD can help you streamline your data analysis workflow and uncover valuable insights from your data.
sqlcoder
Defog's SQLCoder is a family of state-of-the-art large language models (LLMs) for converting natural language questions into SQL queries. It outperforms popular open-source models, and even gpt-4 and gpt-4-turbo, on SQL generation tasks. SQLCoder was trained on more than 20,000 human-curated questions based on 10 different schemas, and the model weights are licensed under CC BY-SA 4.0. Users can interact with SQLCoder through the `transformers` library and run queries using the `sqlcoder launch` command in the terminal. The tool has been tested on NVIDIA GPUs with more than 16GB of VRAM and on Apple Silicon devices with some limitations. SQLCoder offers a demo on Defog's website and provides quantized versions of the model for consumer GPUs with sufficient memory.
TableLLM
TableLLM is a large language model designed for efficient tabular data manipulation tasks in real office scenarios. It can generate code solutions or direct text answers for tasks like insert, delete, update, query, merge, and chart operations on tables embedded in spreadsheets or documents. The model has been fine-tuned based on CodeLlama-7B and 13B, offering two scales: TableLLM-7B and TableLLM-13B. Evaluation results show its performance on benchmarks like WikiSQL, Spider, and self-created table operation benchmark. Users can use TableLLM for code and text generation tasks on tabular data.
mlcraft
Synmetrix (prev. MLCraft) is an open source data engineering platform and semantic layer for centralized metrics management. It provides a complete framework for modeling, integrating, transforming, aggregating, and distributing metrics data at scale. Key features include data modeling and transformations, semantic layer for unified data model, scheduled reports and alerts, versioning, role-based access control, data exploration, caching, and collaboration on metrics modeling. Synmetrix leverages Cube (Cube.js) for flexible data models that consolidate metrics from various sources, enabling downstream distribution via a SQL API for integration into BI tools, reporting, dashboards, and data science. Use cases include data democratization, business intelligence, embedded analytics, and enhancing accuracy in data handling and queries. The tool speeds up data-driven workflows from metrics definition to consumption by combining data engineering best practices with self-service analytics capabilities.
data-scientist-roadmap2024
The Data Scientist Roadmap2024 provides a comprehensive guide to mastering essential tools for data science success. It includes programming languages, machine learning libraries, cloud platforms, and concepts categorized by difficulty. The roadmap covers a wide range of topics from programming languages to machine learning techniques, data visualization tools, and DevOps/MLOps tools. It also includes web development frameworks and specific concepts like supervised and unsupervised learning, NLP, deep learning, reinforcement learning, and statistics. Additionally, it delves into DevOps tools like Airflow and MLFlow, data visualization tools like Tableau and Matplotlib, and other topics such as ETL processes, optimization algorithms, and financial modeling.
VMind
VMind is an open-source solution for intelligent visualization, providing an intelligent chart component based on LLM by VisActor. It allows users to create chart narrative works with natural language interaction, edit charts through dialogue, and export narratives as videos or GIFs. The tool is easy to use, scalable, supports various chart types, and offers one-click export functionality. Users can customize chart styles, specify themes, and aggregate data using LLM models. VMind aims to enhance efficiency in creating data visualization works through dialogue-based editing and natural language interaction.
quadratic
Quadratic is a modern multiplayer spreadsheet application that integrates Python, AI, and SQL functionalities. It aims to streamline team collaboration and data analysis by enabling users to pull data from various sources and utilize popular data science tools. The application supports building dashboards, creating internal tools, mixing data from different sources, exploring data for insights, visualizing Python workflows, and facilitating collaboration between technical and non-technical team members. Quadratic is built with Rust + WASM + WebGL to ensure seamless performance in the browser, and it offers features like WebGL Grid, local file management, Python and Pandas support, Excel formula support, multiplayer capabilities, charts and graphs, and team support. The tool is currently in Beta with ongoing development for additional features like JS support, SQL database support, and AI auto-complete.