
local-deep-research
Local Deep Research achieves ~95% on SimpleQA benchmark (tested with GPT-4.1-mini). Supports local and cloud LLMs (Ollama, Google, Anthropic, ...). Searches 10+ sources - arXiv, PubMed, web, and your private documents. Everything Local.
Stars: 3388

Local Deep Research is a powerful AI-powered research assistant that performs deep, iterative analysis using multiple LLMs and web searches. It can be run locally for privacy or configured to use cloud-based LLMs for enhanced capabilities. The tool offers advanced research capabilities, flexible LLM support, rich output options, privacy-focused operation, enhanced search integration, and academic & scientific integration. It also provides a web interface, command line interface, and supports multiple LLM providers and search engines. Users can configure AI models, search engines, and research parameters for customized research experiences.
README:
AI-powered research assistant for deep, iterative research
Performs deep, iterative research using multiple LLMs and search engines with proper citations
LDR is an AI research assistant that performs systematic research by:
- Breaking down complex questions into focused sub-queries
- Searching multiple sources in parallel (web, academic papers, local documents)
- Verifying information across sources for accuracy
- Creating comprehensive reports with proper citations
It aims to help researchers, students, and professionals find accurate information quickly while maintaining transparency about sources.
- Privacy-Focused: Run entirely locally with Ollama + SearXNG
- Flexible: Use any LLM, any search engine, any vector store
- Comprehensive: Multiple research modes from quick summaries to detailed reports
- Transparent: Track costs and performance with built-in analytics
- Open Source: MIT licensed with an active community
- Signal-level encryption: SQLCipher with AES-256 (same technology used by Signal messenger) protects all user data at rest
- Per-user isolated databases: Each user has their own encrypted database - complete data isolation
- Zero-knowledge architecture: No password recovery mechanism ensures true privacy
- Advanced key derivation: PBKDF2-SHA512 with 256,000 iterations prevents brute-force attacks
- Data integrity: HMAC-SHA512 verification prevents tampering
~95% accuracy on SimpleQA benchmark (preliminary results)
- Tested with GPT-4.1-mini + SearXNG + focused-iteration strategy
- Comparable to state-of-the-art AI research systems
- Local models can achieve similar performance with proper configuration
- Join our community benchmarking effort β
- Quick Summary - Get answers in 30 seconds to 3 minutes with citations
- Detailed Research - Comprehensive analysis with structured findings
- Report Generation - Professional reports with sections and table of contents
- Document Analysis - Search your private documents with AI
- LangChain Integration - Use any vector store as a search engine
- REST API - Authenticated HTTP access with per-user databases
- Benchmarking - Test and optimize your configuration
- Analytics Dashboard - Track costs, performance, and usage metrics
- Real-time Updates - WebSocket support for live research progress
- Export Options - Download results as PDF or Markdown
- Research History - Save, search, and revisit past research
- Adaptive Rate Limiting - Intelligent retry system that learns optimal wait times
- Keyboard Shortcuts - Navigate efficiently (ESC, Ctrl+Shift+1-5)
- Per-User Encrypted Databases - Secure, isolated data storage for each user
- Automated Research Digests - Subscribe to topics and receive AI-powered research summaries
- Customizable Frequency - Daily, weekly, or custom schedules for research updates
- Smart Filtering - AI filters and summarizes only the most relevant developments
- Multi-format Delivery - Get updates as markdown reports or structured summaries
- Topic & Query Support - Track specific searches or broad research areas
- Academic: arXiv, PubMed, Semantic Scholar
- General: Wikipedia, SearXNG
- Technical: GitHub, Elasticsearch
- Historical: Wayback Machine
- News: The Guardian
- Tavily - AI-powered search
- Google - Via SerpAPI or Programmable Search Engine
- Brave Search - Privacy-focused web search
- Local Documents - Search your files with AI
- LangChain Retrievers - Any vector store or database
- Meta Search - Combine multiple engines intelligently
# Step 1: Pull and run SearXNG for optimal search results
docker run -d -p 8080:8080 --name searxng searxng/searxng
# Step 2: Pull and run Local Deep Research (Please build your own docker on ARM)
docker run -d -p 5000:5000 --network host --name local-deep-research --volume 'deep-research:/data' -e LDR_DATA_DIR=/data localdeepresearch/local-deep-research
LDR uses Docker compose to bundle the web app and all it's dependencies so you can get up and running quickly.
Linux:
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d
Windows:
curl.exe -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml; docker compose up -d
Use with a different model:
MODEL=gemma:1b curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d
Open http://localhost:5000 after ~30 seconds. This starts LDR with SearXNG and all dependencies.
See docker-compose.yml for a docker-compose file with reasonable defaults to get up and running with ollama, searxng, and local deep research all running locally.
Things you may want/need to configure:
- Ollama GPU driver
- Ollama context length (depends on available VRAM)
- Ollama keep alive (duration model will stay loaded into VRAM and idle before getting unloaded automatically)
- Deep Research model (depends on available VRAM and preference)
- Docker
- Docker Compose
-
cookiecutter
: Runpip install --user cookiecutter
Clone the repository:
git clone https://github.com/LearningCircuit/local-deep-research.git
cd local-deep-research
Cookiecutter will interactively guide you through the process of creating a
docker-compose
configuration that meets your specific needs. This is the
recommended approach if you are not very familiar with Docker.
In the LDR repository, run the following command to generate the compose file:
cookiecutter cookiecutter-docker/
docker compose -f docker-compose.default.yml up
# Step 1: Install the package
pip install local-deep-research
# Step 2: Setup SearXNG for best results
docker pull searxng/searxng
docker run -d -p 8080:8080 --name searxng searxng/searxng
# Step 3: Install Ollama from https://ollama.ai
# Step 4: Download a model
ollama pull gemma3:12b
# Step 5: Build frontend assets (required for Web UI)
# Note: If installed via pip and using the Web UI, you need to build assets
# Navigate to the installation directory first (find with: pip show local-deep-research)
npm install
npm run build
# Step 6: Start the web interface
python -m local_deep_research.web.app
Important for pip users: If you installed via pip and want to use the web UI, you must run npm install
and npm run build
in the package installation directory to generate frontend assets (icons, styles). Without this, the UI will have missing icons and styling issues. For programmatic API usage only, these steps can be skipped.
from local_deep_research.api import LDRClient, quick_query
# Option 1: Simplest - one line research
summary = quick_query("username", "password", "What is quantum computing?")
print(summary)
# Option 2: Client for multiple operations
client = LDRClient()
client.login("username", "password")
result = client.quick_research("What are the latest advances in quantum computing?")
print(result["summary"])
import requests
# Create session and authenticate
session = requests.Session()
# First, get the login page to retrieve CSRF token for login
login_page = session.get("http://localhost:5000/auth/login")
# Extract CSRF token from HTML form (use BeautifulSoup or regex)
# For simplicity, assuming you have the token
# Login with form data (not JSON) including CSRF token
session.post("http://localhost:5000/auth/login",
data={"username": "user", "password": "pass", "csrf_token": "..."})
# Get CSRF token for API requests
csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]
# Make API request with CSRF header
response = session.post(
"http://localhost:5000/research/api/start",
json={"query": "Explain CRISPR gene editing"},
headers={"X-CSRF-Token": csrf}
)
# Run benchmarks from CLI
python -m local_deep_research.benchmarks --dataset simpleqa --examples 50
# Manage rate limiting
python -m local_deep_research.web_search_engines.rate_limiting status
python -m local_deep_research.web_search_engines.rate_limiting reset
Connect LDR to your existing knowledge base:
from local_deep_research.api import quick_summary
# Use your existing LangChain retriever
result = quick_summary(
query="What are our deployment procedures?",
retrievers={"company_kb": your_retriever},
search_tool="company_kb"
)
Works with: FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and any LangChain-compatible retriever.
Early experiments on small SimpleQA dataset samples:
Configuration | Accuracy | Notes |
---|---|---|
gpt-4.1-mini + SearXNG + focused_iteration | 90-95% | Limited sample size |
gpt-4.1-mini + Tavily + focused_iteration | 90-95% | Limited sample size |
gemini-2.0-flash-001 + SearXNG | 82% | Single test run |
Note: These are preliminary results from initial testing. Performance varies significantly based on query types, model versions, and configurations. Run your own benchmarks β
Track costs, performance, and usage with detailed metrics. Learn more β
- Llama 3, Mistral, Gemma, DeepSeek
- LLM processing stays local (search queries still go to web)
- No API costs
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude 3)
- Google (Gemini)
- 100+ models via OpenRouter
- Discord - Get help and share research techniques
- Reddit - Updates and showcases
- GitHub Issues - Bug reports
We welcome contributions! See our Contributing Guide to get started.
MIT License - see LICENSE file.
Built with: LangChain, Ollama, SearXNG, FAISS
Support Free Knowledge: Consider donating to Wikipedia, arXiv, or PubMed.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for local-deep-research
Similar Open Source Tools

local-deep-research
Local Deep Research is a powerful AI-powered research assistant that performs deep, iterative analysis using multiple LLMs and web searches. It can be run locally for privacy or configured to use cloud-based LLMs for enhanced capabilities. The tool offers advanced research capabilities, flexible LLM support, rich output options, privacy-focused operation, enhanced search integration, and academic & scientific integration. It also provides a web interface, command line interface, and supports multiple LLM providers and search engines. Users can configure AI models, search engines, and research parameters for customized research experiences.

robustmq
RobustMQ is a next-generation, high-performance, multi-protocol message queue built in Rust. It aims to create a unified messaging infrastructure tailored for modern cloud-native and AI systems. With features like high performance, distributed architecture, multi-protocol support, pluggable storage, cloud-native readiness, multi-tenancy, security features, observability, and user-friendliness, RobustMQ is designed to be production-ready and become a top-level Apache project in the message queue ecosystem by the second half of 2025.

R2R
R2R (RAG to Riches) is a fast and efficient framework for serving high-quality Retrieval-Augmented Generation (RAG) to end users. The framework is designed with customizable pipelines and a feature-rich FastAPI implementation, enabling developers to quickly deploy and scale RAG-based applications. R2R was conceived to bridge the gap between local LLM experimentation and scalable production solutions. **R2R is to LangChain/LlamaIndex what NextJS is to React**. A JavaScript client for R2R deployments can be found here. ### Key Features * **π Deploy** : Instantly launch production-ready RAG pipelines with streaming capabilities. * **π§© Customize** : Tailor your pipeline with intuitive configuration files. * **π Extend** : Enhance your pipeline with custom code integrations. * **βοΈ Autoscale** : Scale your pipeline effortlessly in the cloud using SciPhi. * **π€ OSS** : Benefit from a framework developed by the open-source community, designed to simplify RAG deployment.

neuropilot
NeuroPilot is an open-source AI-powered education platform that transforms study materials into interactive learning resources. It provides tools like contextual chat, smart notes, flashcards, quizzes, and AI podcasts. Supported by various AI models and embedding providers, it offers features like WebSocket streaming, JSON or vector database support, file-based storage, and configurable multi-provider setup for LLMs and TTS engines. The technology stack includes Node.js, TypeScript, Vite, React, TailwindCSS, JSON database, multiple LLM providers, and Docker for deployment. Users can contribute to the project by integrating AI models, adding mobile app support, improving performance, enhancing accessibility features, and creating documentation and tutorials.

claude-007-agents
Claude Code Agents is an open-source AI agent system designed to enhance development workflows by providing specialized AI agents for orchestration, resilience engineering, and organizational memory. These agents offer specialized expertise across technologies, AI system with organizational memory, and an agent orchestration system. The system includes features such as engineering excellence by design, advanced orchestration system, Task Master integration, live MCP integrations, professional-grade workflows, and organizational intelligence. It is suitable for solo developers, small teams, enterprise teams, and open-source projects. The system requires a one-time bootstrap setup for each project to analyze the tech stack, select optimal agents, create configuration files, set up Task Master integration, and validate system readiness.

bifrost
Bifrost is a high-performance AI gateway that unifies access to multiple providers through a single OpenAI-compatible API. It offers features like automatic failover, load balancing, semantic caching, and enterprise-grade functionalities. Users can deploy Bifrost in seconds with zero configuration, benefiting from its core infrastructure, advanced features, enterprise and security capabilities, and developer experience. The repository structure is modular, allowing for maximum flexibility. Bifrost is designed for quick setup, easy configuration, and seamless integration with various AI models and tools.

llamafarm
LlamaFarm is a comprehensive AI framework that empowers users to build powerful AI applications locally, with full control over costs and deployment options. It provides modular components for RAG systems, vector databases, model management, prompt engineering, and fine-tuning. Users can create differentiated AI products without needing extensive ML expertise, using simple CLI commands and YAML configs. The framework supports local-first development, production-ready components, strategy-based configuration, and deployment anywhere from laptops to the cloud.

meeting-minutes
An open-source AI assistant for taking meeting notes that captures live meeting audio, transcribes it in real-time, and generates summaries while ensuring user privacy. Perfect for teams to focus on discussions while automatically capturing and organizing meeting content without external servers or complex infrastructure. Features include modern UI, real-time audio capture, speaker diarization, local processing for privacy, and more. The tool also offers a Rust-based implementation for better performance and native integration, with features like live transcription, speaker diarization, and a rich text editor for notes. Future plans include database connection for saving meeting minutes, improving summarization quality, and adding download options for meeting transcriptions and summaries. The backend supports multiple LLM providers through a unified interface, with configurations for Anthropic, Groq, and Ollama models. System architecture includes core components like audio capture service, transcription engine, LLM orchestrator, data services, and API layer. Prerequisites for setup include Node.js, Python, FFmpeg, and Rust. Development guidelines emphasize project structure, testing, documentation, type hints, and ESLint configuration. Contributions are welcome under the MIT License.

enferno
Enferno is a modern Flask framework optimized for AI-assisted development workflows. It combines carefully crafted development patterns, smart Cursor Rules, and modern libraries to enable developers to build sophisticated web applications with unprecedented speed. Enferno's intelligent patterns and contextual guides help create production-ready SAAS applications faster than ever. It includes features like modern stack, authentication, OAuth integration, database support, task queue, frontend components, security measures, Docker readiness, and more.

finite-monkey-engine
FiniteMonkey is an advanced vulnerability mining engine powered purely by GPT, requiring no prior knowledge base or fine-tuning. Its effectiveness significantly surpasses most current related research approaches. The tool is task-driven, prompt-driven, and focuses on prompt design, leveraging 'deception' and hallucination as key mechanics. It has helped identify vulnerabilities worth over $60,000 in bounties. The tool requires PostgreSQL database, OpenAI API access, and Python environment for setup. It supports various languages like Solidity, Rust, Python, Move, Cairo, Tact, Func, Java, and Fake Solidity for scanning. FiniteMonkey is best suited for logic vulnerability mining in real projects, not recommended for academic vulnerability testing. GPT-4-turbo is recommended for optimal results with an average scan time of 2-3 hours for medium projects. The tool provides detailed scanning results guide and implementation tips for users.

kweaver
KWeaver is an open-source cognitive intelligence development framework that provides data scientists, application developers, and domain experts with the ability for rapid development, comprehensive openness, and high-performance knowledge network generation and cognitive intelligence large model framework. It offers features such as automated and visual knowledge graph construction, visualization and analysis of knowledge graph data, knowledge graph integration, knowledge graph resource management, large model prompt engineering and debugging, and visual configuration for large model access.

Awesome-Lists
Awesome-Lists is a curated list of awesome lists across various domains of computer science and beyond, including programming languages, web development, data science, and more. It provides a comprehensive index of articles, books, courses, open source projects, and other resources. The lists are organized by topic and subtopic, making it easy to find the information you need. Awesome-Lists is a valuable resource for anyone looking to learn more about a particular topic or to stay up-to-date on the latest developments in the field.

lyraios
LYRAIOS (LLM-based Your Reliable AI Operating System) is an advanced AI assistant platform built with FastAPI and Streamlit, designed to serve as an operating system for AI applications. It offers core features such as AI process management, memory system, and I/O system. The platform includes built-in tools like Calculator, Web Search, Financial Analysis, File Management, and Research Tools. It also provides specialized assistant teams for Python and research tasks. LYRAIOS is built on a technical architecture comprising FastAPI backend, Streamlit frontend, Vector Database, PostgreSQL storage, and Docker support. It offers features like knowledge management, process control, and security & access control. The roadmap includes enhancements in core platform, AI process management, memory system, tools & integrations, security & access control, open protocol architecture, multi-agent collaboration, and cross-platform support.

ComfyUI-fal-API
ComfyUI-fal-API is a repository containing custom nodes for using Flux models with fal API in ComfyUI. It provides nodes for image generation, video generation, language models, and vision language models. Users can easily install and configure the repository to access various nodes for different tasks such as generating images, creating videos, processing text, and understanding images. The repository also includes troubleshooting steps and is licensed under the Apache License 2.0.

bytebot
Bytebot is an open-source AI desktop agent that provides a virtual employee with its own computer to complete tasks for users. It can use various applications, download and organize files, log into websites, process documents, and perform complex multi-step workflows. By giving AI access to a complete desktop environment, Bytebot unlocks capabilities not possible with browser-only agents or API integrations, enabling complete task autonomy, document processing, and usage of real applications.

AIPex
AIPex is a revolutionary Chrome extension that transforms your browser into an intelligent automation platform. Using natural language commands and AI-powered intelligence, AIPex can automate virtually any browser task - from complex multi-step workflows to simple repetitive actions. It offers features like natural language control, AI-powered intelligence, multi-step automation, universal compatibility, smart data extraction, precision actions, form automation, visual understanding, developer-friendly with extensive API, and lightning-fast execution of automation tasks.
For similar tasks

Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customerβs subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.

sorrentum
Sorrentum is an open-source project that aims to combine open-source development, startups, and brilliant students to build machine learning, AI, and Web3 / DeFi protocols geared towards finance and economics. The project provides opportunities for internships, research assistantships, and development grants, as well as the chance to work on cutting-edge problems, learn about startups, write academic papers, and get internships and full-time positions at companies working on Sorrentum applications.

tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.

zep-python
Zep is an open-source platform for building and deploying large language model (LLM) applications. It provides a suite of tools and services that make it easy to integrate LLMs into your applications, including chat history memory, embedding, vector search, and data enrichment. Zep is designed to be scalable, reliable, and easy to use, making it a great choice for developers who want to build LLM-powered applications quickly and easily.

telemetry-airflow
This repository codifies the Airflow cluster that is deployed at workflow.telemetry.mozilla.org (behind SSO) and commonly referred to as "WTMO" or simply "Airflow". Some links relevant to users and developers of WTMO: * The `dags` directory in this repository contains some custom DAG definitions * Many of the DAGs registered with WTMO don't live in this repository, but are instead generated from ETL task definitions in bigquery-etl * The Data SRE team maintains a WTMO Developer Guide (behind SSO)

mojo
Mojo is a new programming language that bridges the gap between research and production by combining Python syntax and ecosystem with systems programming and metaprogramming features. Mojo is still young, but it is designed to become a superset of Python over time.

pandas-ai
PandasAI is a Python library that makes it easy to ask questions to your data in natural language. It helps you to explore, clean, and analyze your data using generative AI.

databend
Databend is an open-source cloud data warehouse that serves as a cost-effective alternative to Snowflake. With its focus on fast query execution and data ingestion, it's designed for complex analysis of the world's largest datasets.
For similar jobs

SLR-FC
This repository provides a comprehensive collection of AI tools and resources to enhance literature reviews. It includes a curated list of AI tools for various tasks, such as identifying research gaps, discovering relevant papers, visualizing paper content, and summarizing text. Additionally, the repository offers materials on generative AI, effective prompts, copywriting, image creation, and showcases of AI capabilities. By leveraging these tools and resources, researchers can streamline their literature review process, gain deeper insights from scholarly literature, and improve the quality of their research outputs.

paper-ai
Paper-ai is a tool that helps you write papers using artificial intelligence. It provides features such as AI writing assistance, reference searching, and editing and formatting tools. With Paper-ai, you can quickly and easily create high-quality papers.

paper-qa
PaperQA is a minimal package for question and answering from PDFs or text files, providing very good answers with in-text citations. It uses OpenAI Embeddings to embed and search documents, and follows a process of embedding docs and queries, searching for top passages, creating summaries, scoring and selecting relevant summaries, putting summaries into prompt, and generating answers. Users can customize prompts and use various models for embeddings and LLMs. The tool can be used asynchronously and supports adding documents from paths, files, or URLs.

ChatData
ChatData is a robust chat-with-documents application designed to extract information and provide answers by querying the MyScale free knowledge base or uploaded documents. It leverages the Retrieval Augmented Generation (RAG) framework, millions of Wikipedia pages, and arXiv papers. Features include self-querying retriever, VectorSQL, session management, and building a personalized knowledge base. Users can effortlessly navigate vast data, explore academic papers, and research documents. ChatData empowers researchers, students, and knowledge enthusiasts to unlock the true potential of information retrieval.

noScribe
noScribe is an AI-based software designed for automated audio transcription, specifically tailored for transcribing interviews for qualitative social research or journalistic purposes. It is a free and open-source tool that runs locally on the user's computer, ensuring data privacy. The software can differentiate between speakers and supports transcription in 99 languages. It includes a user-friendly editor for reviewing and correcting transcripts. Developed by Kai DrΓΆge, a PhD in sociology with a background in computer science, noScribe aims to streamline the transcription process and enhance the efficiency of qualitative analysis.

AIStudyAssistant
AI Study Assistant is an app designed to enhance learning experience and boost academic performance. It serves as a personal tutor, lecture summarizer, writer, and question generator powered by Google PaLM 2. Features include interacting with an AI chatbot, summarizing lectures, generating essays, and creating practice questions. The app is built using 100% Kotlin, Jetpack Compose, Clean Architecture, and MVVM design pattern, with technologies like Ktor, Room DB, Hilt, and Kotlin coroutines. AI Study Assistant aims to provide comprehensive AI-powered assistance for students in various academic tasks.

data-to-paper
Data-to-paper is an AI-driven framework designed to guide users through the process of conducting end-to-end scientific research, starting from raw data to the creation of comprehensive and human-verifiable research papers. The framework leverages a combination of LLM and rule-based agents to assist in tasks such as hypothesis generation, literature search, data analysis, result interpretation, and paper writing. It aims to accelerate research while maintaining key scientific values like transparency, traceability, and verifiability. The framework is field-agnostic, supports both open-goal and fixed-goal research, creates data-chained manuscripts, involves human-in-the-loop interaction, and allows for transparent replay of the research process.

k2
K2 (GeoLLaMA) is a large language model for geoscience, trained on geoscience literature and fine-tuned with knowledge-intensive instruction data. It outperforms baseline models on objective and subjective tasks. The repository provides K2 weights, core data of GeoSignal, GeoBench benchmark, and code for further pretraining and instruction tuning. The model is available on Hugging Face for use. The project aims to create larger and more powerful geoscience language models in the future.