shandu

shandu

OpenAI DeepResearch alternative, An AI-driven research system that performs comprehensive, iterative research on any topic using multiple search engines and LLMs.

Stars: 426

Visit
 screenshot

Shandu is an advanced AI research system that automates comprehensive research processes using language models, web scraping, and iterative exploration to generate well-structured reports with citations. It features intelligent state-based workflow, deep exploration, multi-source information synthesis, enhanced web scraping, smart source evaluation, content analysis pipeline, comprehensive report generation, parallel processing, adaptive search strategy, and full citation management.

README:

Shandu: Advanced AI Research System for Comprehensive Knowledge Synthesis

Shandu is a cutting-edge AI research assistant that performs in-depth, multi-source research on any topic using advanced language models, intelligent web scraping, and iterative exploration to generate comprehensive, well-structured reports with proper citations.

MIT License Python 3.9+

🔍 What is Shandu?

Shandu is an intelligent, LLM-powered research system that automates the comprehensive research process - from initial query clarification to in-depth content analysis and report generation. Built on LangGraph's state-based workflow, it recursively explores topics with sophisticated algorithms for source evaluation, content extraction, and knowledge synthesis.

Key Use Cases

  • Academic Research: Generate literature reviews, background information, and complex topic analyses
  • Market Intelligence: Analyze industry trends, competitor strategies, and market opportunities
  • Content Creation: Produce well-researched articles, blog posts, and reports with proper citations
  • Technology Exploration: Track emerging technologies, innovations, and technical developments
  • Policy Analysis: Research regulations, compliance requirements, and policy implications
  • Competitive Analysis: Compare products, services, and company strategies across industries

⚙️ How Shandu Works

flowchart TB
    subgraph Input
        Q[User Query]
        B[Breadth Parameter]
        D[Depth Parameter]
    end

    DR[Deep Research] -->
    SQ[SERP Queries] -->
    PR[Process Results]

    subgraph Results[Results]
        direction TB
        NL((Learnings))
        ND((Directions))
    end

    PR --> NL
    PR --> ND

    DP{depth > 0?}

    RD["Next Direction:
    - Prior Goals
    - New Questions
    - Learnings"]

    MR[Markdown Report]

    %% Main Flow
    Q & B & D --> DR

    %% Results to Decision
    NL & ND --> DP

    %% Circular Flow
    DP -->|Yes| RD
    RD -->|New Context| DR

    %% Final Output
    DP -->|No| MR

    %% Styling
    classDef input fill:#7bed9f,stroke:#2ed573,color:black
    classDef process fill:#70a1ff,stroke:#1e90ff,color:black
    classDef recursive fill:#ffa502,stroke:#ff7f50,color:black
    classDef output fill:#ff4757,stroke:#ff6b81,color:black
    classDef results fill:#a8e6cf,stroke:#3b7a57,color:black

    class Q,B,D input
    class DR,SQ,PR process
    class DP,RD recursive
    class MR output
    class NL,ND results

🌟 Key Features

  • Intelligent State-based Workflow: Leverages LangGraph for a structured, step-by-step research process with clear state transitions
  • Iterative Deep Exploration: Recursively explores topics with dynamic depth and breadth, adapting to information discovered
  • Multi-source Information Synthesis: Analyzes data from search engines, web content, and structured knowledge bases
  • Enhanced Web Scraping: Features dynamic JS rendering, content extraction, and ethical scraping practices
  • Smart Source Evaluation: Automatically assesses source credibility, relevance, and information value
  • Content Analysis Pipeline: Uses advanced NLP to extract key information, identify patterns, and synthesize findings
  • Comprehensive Report Generation: Creates detailed, well-structured reports with proper citations and thematic organization
  • Parallel Processing Architecture: Implements concurrent operations for efficient multi-query execution
  • Adaptive Search Strategy: Dynamically adjusts search queries based on discovered information and knowledge gaps
  • Full Citation Management: Properly attributes all sources with formatted citations in multiple styles

🚀 Installation

# Install from PyPI
pip install shandu

# Install from source
git clone https://github.com/jolovicdev/shandu.git
cd shandu
pip install -e .

🏁 Quick Start

# Configure API settings (supports various LLM providers)
shandu configure

# Run comprehensive research
shandu research "Your research query" --depth 2 --breadth 4 --output report.md

# Quick AI-powered search with web scraping (You dont need Perplexity!)
shandu aisearch "Who is the current sitting president of United States?" --detailed

# Basic multi-engine search
shandu search "Your search query"

📚 Detailed Usage

Research Command

shandu research "Your research query" \
    --depth 3 \                # How deep to explore (1-5, default: 2)
    --breadth 5 \              # How many parallel queries (2-10, default: 4)
    --output report.md \       # Save to file instead of terminal
    --verbose                  # Show detailed progress

Example Reports

You can find example reports in the examples directory:

  1. The Intersection of Quantum Computing, Synthetic Biology, and Climate Modeling
    shandu research "The Intersection of Quantum Computing, Synthetic Biology, and Climate Modeling" --depth 3 --breadth 3 --output examples/o3-mini-high.md

AI Search Command with Scraping

shandu aisearch "Your search query" \
    --engines "google,duckduckgo" \  # Comma-separated list of search engines
    --max-results 15 \               # Maximum number of results to return
    --output results.md \            # Save to file instead of terminal
    --detailed                       # Generate a detailed analysis

💻 Python API

from shandu.agents import ResearchGraph
from langchain_openai import ChatOpenAI

# Initialize with custom LLM if desired
llm = ChatOpenAI(model="gpt-4")

# Initialize the research graph
researcher = ResearchGraph(
    llm=llm,
    temperature=0.5
)

# Perform deep research
results = researcher.research_sync(
    query="Your research query",
    depth=3,       # How deep to go with recursive research
    breadth=4,     # How many parallel queries to explore
    detail_level="high"
)

# Print or save results
print(results.to_markdown())

🧩 Advanced Architecture

Shandu implements a sophisticated multi-stage research pipeline:

  1. Query Clarification: Interactive questions to understand research needs
  2. Research Planning: Strategic planning for comprehensive topic coverage
  3. Iterative Exploration:
    • Smart query generation based on knowledge gaps
    • Multi-engine search with parallelized execution
    • Relevance filtering of search results
    • Intelligent web scraping with content extraction
    • Source credibility assessment
    • Information analysis and synthesis
    • Reflection on findings to identify gaps
  4. Report Generation:
    • Theme extraction and organization
    • Multi-step report enhancement
    • Citation formatting and management
    • Section expansion for comprehensive coverage

🔌 Supported Search Engines & Sources

  • Google Search
  • DuckDuckGo
  • Wikipedia
  • ArXiv (academic papers)
  • Custom search engines can be added

📊 Technical Capabilities

  • Dynamic JS Rendering: Handles JavaScript-heavy websites
  • Content Extraction: Identifies and extracts main content from web pages
  • Parallel Processing: Concurrent execution of searches and scraping
  • Caching: Efficient caching of search results and scraped content
  • Rate Limiting: Respectful access to web resources
  • Robots.txt Compliance: Ethical web scraping practices
  • Flexible Output Formats: Markdown, JSON, plain text

📜 License & INFO

Note: this is just an idea, i need contributors, this is project where I'm taking deep dive into LangChain and LangGraph! This project is licensed under the MIT License - see the LICENSE file for details.

For Tasks:

Click tags to check more tools for each tasks

For Jobs:

Alternative AI tools for shandu

Similar Open Source Tools

For similar tasks

For similar jobs