youtu-graphrag

Official repository of Youtu-GraphRAG: Vertically Unified Agents for Graph Retrieval-Augmented Complex Reasoning

Stars: 441


Youtu-GraphRAG is a vertically unified agentic paradigm that connects the entire framework based on graph schema, allowing seamless domain transfer with minimal intervention. It introduces key innovations like schema-guided hierarchical knowledge tree construction, dually-perceived community detection, agentic retrieval, advanced construction and reasoning capabilities, fair anonymous dataset 'AnonyRAG', and unified configuration management. The framework demonstrates robustness with lower token cost and higher accuracy compared to state-of-the-art methods, enabling enterprise-scale deployment with minimal manual intervention for new domains.

README:

Youtu-GraphRAG: Vertically Unified Agents for Graph Retrieval-Augmented Complex Reasoning

License: MIT

🚀 A framework that moves the Pareto frontier, with 33.6% lower token cost and 16.62% higher accuracy than SOTA baselines


🎯 Brief Introduction

Youtu-GraphRAG is a vertically unified agentic paradigm in which every stage of the framework is tied together by a shared graph schema. Moving to a new domain requires only minimal changes to that schema, which points toward a next-generation GraphRAG paradigm that adapts readily to real-world applications.


🎨 When and Why to use Youtu-GraphRAG

🔗 Multi-hop Reasoning/Summarization/Conclusion: Complex questions requiring multi-step reasoning
📚 Knowledge-Intensive Tasks: Questions dependent on large amounts of structured/private/domain knowledge
🌐 Domain Scalability: Easily supports encyclopedias, academic papers, commercial/private knowledge bases and other domains with minimal intervention on the schema

🏗️ Framework Architecture

Figure: A sketched overview of the proposed Youtu-GraphRAG framework.

📲 Interactive interface

Screenshots: graph construction and retrieval.

🚀 Contributions and Novelty

Building on a unified agentic paradigm for Graph Retrieval-Augmented Generation (GraphRAG), Youtu-GraphRAG introduces several key innovations that tie the entire framework together:

🏗️ 1. Schema-Guided Hierarchical Knowledge Tree Construction

  • 🌱 Seed Graph Schema: Introduces targeted entity types, relations, and attribute types to bound automatic extraction agents
  • 📈 Scalable Schema Expansion: Continuously expands schemas for adaptability over unseen domains
  • 🏢 Four-Level Architecture:
    • Level 1 (Attributes): Entity property information
    • Level 2 (Relations): Entity relationship triples
    • Level 3 (Keywords): Keyword indexing
    • Level 4 (Communities): Hierarchical community structure
  • ⚡ Quick Adaptation to industrial applications: We allow seamless domain transfer with minimal intervention on the schema
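
As a rough illustration of how a seed schema can bound extraction and later be expanded, consider the minimal sketch below. The `SeedSchema` class, its methods and the example types are hypothetical and are not taken from the repository; only the four level names come from the list above.

```python
from dataclasses import dataclass, field

# Hypothetical seed schema; class and field names are illustrative only.
@dataclass
class SeedSchema:
    entity_types: set = field(default_factory=set)
    relations: set = field(default_factory=set)
    attribute_types: set = field(default_factory=set)

    def accepts(self, head_type, relation, tail_type):
        """Return True if a candidate triple stays within the schema."""
        return (
            head_type in self.entity_types
            and tail_type in self.entity_types
            and relation in self.relations
        )

    def expand(self, entity_types=(), relations=(), attribute_types=()):
        """Add newly proposed types so later extractions can use them."""
        self.entity_types.update(entity_types)
        self.relations.update(relations)
        self.attribute_types.update(attribute_types)

# The four levels of the knowledge tree, numbered as in the list above.
LEVELS = {1: "attributes", 2: "relations", 3: "keywords", 4: "communities"}

schema = SeedSchema(
    entity_types={"person", "organization"},
    relations={"works_for"},
    attribute_types={"birth_year"},
)

# Candidate triples as (head type, relation, tail type).
candidates = [
    ("person", "works_for", "organization"),
    ("person", "born_in", "location"),
]

# Only triples covered by the seed schema survive.
kept = [t for t in candidates if schema.accepts(*t)]

# After expanding the schema, the second triple is accepted as well.
schema.expand(entity_types={"location"}, relations={"born_in"})
kept_after_expansion = [t for t in candidates if schema.accepts(*t)]
```

In this sketch, adapting to a new domain amounts to editing or expanding the schema object while the filtering logic stays unchanged.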

🌳 2. Dually-Perceived Community Detection

  • 🔬 Novel Community Detection Algorithm: Fuses structural topology with subgraph semantics for comprehensive knowledge organization
  • 📊 Hierarchical Knowledge Tree: Naturally yields a structure that supports both top-down filtering and bottom-up reasoning and outperforms the traditional Leiden and Louvain algorithms
  • 📝 Community Summaries: LLM-enhanced community summarization for higher-level knowledge abstraction
Figure: Youtu-GraphRAG community detection.
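
The project's actual algorithm is not reproduced here. As a loose illustration of what fusing topology with semantics can mean, the sketch below blends a neighbor-overlap score with an embedding similarity. The function names, the choice of Jaccard and cosine similarity, and the `alpha` weight are all assumptions made for this example.

```python
import math

def jaccard(a, b):
    """Structural signal: overlap between two neighbor sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(u, v):
    """Semantic signal: cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def dual_score(neighbors_a, neighbors_b, emb_a, emb_b, alpha=0.5):
    """Weighted blend of the structural and semantic signals (alpha is hypothetical)."""
    structural = jaccard(neighbors_a, neighbors_b)
    semantic = cosine(emb_a, emb_b)
    return alpha * structural + (1 - alpha) * semantic

# Two nodes sharing one of three distinct neighbors, with identical embeddings.
score = dual_score({"x", "y"}, {"y", "z"}, [1.0, 0.0], [1.0, 0.0])
```

A score of this kind could be used to decide whether two nodes belong in the same community; how Youtu-GraphRAG actually combines the two views is described in the paper.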

🤖 3. Agentic Retrieval

  • 🎯 Schema-Aware Decomposition: Interprets the same graph schema to transform complex queries into tractable and parallel sub-queries
  • 🔄 Iterative Reflection: Performs reflection for more advanced reasoning through IRCoT (Iterative Retrieval Chain of Thought)
Figure: Youtu-GraphRAG agentic decomposer.
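
The interplay of decomposition and iterative retrieval can be sketched with a toy loop, shown below. This is a simplified stand-in, not the repository's retriever: `decompose`, `retrieve`, `iterative_retrieve` and the placeholder triples are hypothetical, and the "reflection" step is reduced to following the object of each retrieved triple as the next query.

```python
def decompose(question, known_entities):
    """Stand-in decomposition: pick the known entities mentioned in the question."""
    return [entity for entity in known_entities if entity in question]

def retrieve(entity, triples):
    """Stand-in retrieval: return all triples whose subject is the entity."""
    return [t for t in triples if t[0] == entity]

def iterative_retrieve(start_entities, triples, max_rounds=3):
    """Alternate retrieval with a follow-up step for up to max_rounds hops."""
    evidence = []
    seen = set()
    frontier = list(start_entities)
    for _ in range(max_rounds):
        next_frontier = []
        for entity in frontier:
            if entity in seen:
                continue
            seen.add(entity)
            for triple in retrieve(entity, triples):
                evidence.append(triple)
                # Follow-up: the object of this triple becomes the next query.
                next_frontier.append(triple[2])
        if not next_frontier:
            break
        frontier = next_frontier
    return evidence

# Placeholder knowledge, not real data.
triples = [
    ("film_x", "directed_by", "director_y"),
    ("director_y", "born_in", "city_z"),
]

starts = decompose("Where was the director of film_x born?", ["film_x", "director_y"])
evidence = iterative_retrieve(starts, triples)
```

Here the second hop is reachable only because the first retrieval surfaced `director_y`, which is the kind of multi-step dependency the iterative design is meant to handle.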

🧠 4. Advanced Construction and Reasoning Capabilities for real-world deployment

  • 🎯 Performance Enhancement: Lower token cost and higher accuracy with optimized prompting, indexing and retrieval strategies
  • 🤹‍♀️ User-friendly visualization: The four-level knowledge tree in output/graphs/ can be imported into Neo4j for visualization, making reasoning paths and knowledge organization visible to users
  • ⚡ Parallel Sub-question Processing: Concurrent handling of decomposed questions for efficiency and complex scenarios
  • 🤔 Iterative Reasoning: Step-by-step answer construction with reasoning traces
  • 📊 Domain Scalability: Designed for enterprise-scale deployment with minimal manual intervention for new domains
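
Concurrent handling of decomposed sub-questions could look roughly like the sketch below, which uses Python's standard thread pool. The `answer_sub_question` function is a hypothetical placeholder for a retrieval or LLM call; the repository may organize its concurrency differently.

```python
from concurrent.futures import ThreadPoolExecutor

def answer_sub_question(sub_question):
    """Placeholder for a retrieval or LLM call (hypothetical)."""
    return f"answer to: {sub_question}"

def answer_in_parallel(sub_questions, max_workers=4):
    """Answer sub-questions concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(answer_sub_question, sub_questions))

results = answer_in_parallel(["q1", "q2", "q3"])
```

Threads suit this pattern because calls to a remote LLM API are I/O-bound, so the workers spend most of their time waiting on the network.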

📈 5. Fair Anonymous Dataset 'AnonyRAG'

  • Link: Hugging Face AnonyRAG
  • Guards against knowledge leakage from LLM/embedding model pretraining
  • In-depth test of the real retrieval performance of GraphRAG
  • Multilingual, with Chinese and English versions

⚙️ 6. Unified Configuration Management

  • 🎛️ Centralized Parameter Management: All components configured through a single YAML file
  • 🔧 Runtime Parameter Override: Dynamic configuration adjustment during execution
  • 🌐 Multi-Environment Support: Seamless domain transfer with minimal intervention on schema
  • 🔄 Backward Compatibility: Ensures existing code continues to function
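
A runtime override on top of a central configuration can be pictured as a recursive merge, as in the sketch below. The `override` helper and the configuration keys are hypothetical and do not reflect the actual contents of `base_config.yaml` or the API of `config_loader.py`; plain dictionaries stand in for the parsed YAML.

```python
import copy

def override(base, updates):
    """Return a new config in which nested keys from updates replace those in base."""
    merged = copy.deepcopy(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = override(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Hypothetical keys standing in for values loaded from a YAML file.
base_config = {
    "retrieval": {"top_k": 10, "use_faiss": True},
    "llm": {"temperature": 0.0},
}

# Override a single nested value at runtime; everything else is inherited.
runtime_config = override(base_config, {"retrieval": {"top_k": 5}})
```

Returning a new dictionary rather than mutating the base keeps the file-backed defaults intact, so several runs can apply different overrides to the same base.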

📊 Performance Comparisons

Extensive experiments across six challenging benchmarks, including GraphRAG-Bench, HotpotQA and MuSiQue, demonstrate the robustness of Youtu-GraphRAG. It moves the Pareto frontier with 33.6% lower token cost and 16.62% higher accuracy than state-of-the-art baselines. The results also indicate its adaptability, allowing seamless domain transfer with minimal intervention on the schema.

Figures: cost/accuracy performance, moving the Pareto frontier, and radar comparison.

📁 Project Structure

youtu-graphrag/
├── 📁 config/                     # Configuration System
│   ├── base_config.yaml           # Main configuration file
│   ├── config_loader.py           # Configuration loader
│   └── __init__.py                # Configuration module interface
│
├── 📁 data/                       # Data Directory
│
├── 📁 models/                     # Core Models
│   ├── 📁 constructor/            # Knowledge Graph Construction
│   │   └── kt_gen.py              # KTBuilder - Hierarchical graph builder
│   └── 📁 retriever/              # Retrieval Module
│       ├── enhanced_kt_retriever.py  # KTRetriever - Main retriever
│       ├── agentic_decomposer.py     # Query decomposer
│       └── faiss_filter.py           # DualFAISSRetriever - FAISS retrieval
│
├── 📁 utils/                      # Utility Modules
│   ├── tree_comm.py              # Community detection algorithm
│   ├── call_llm_api.py           # LLM API calling
│   ├── eval.py                   # Evaluation tools
│   └── graph_processor.py        # Graph processing tools
│
├── 📁 schemas/                   # Dataset Schemas
├── 📁 assets/                    # Assets (images, figures)
│
├── 📁 output/                    # Output Directory
│   ├── graphs/                   # Constructed knowledge graphs
│   ├── chunks/                   # Text chunk information
│   └── logs/                     # Runtime logs
│
├── 📁 retriever/                 # Retrieval Cache
│
├── main.py                       # 🎯 Main program entry
├── requirements.txt              # Dependencies list
├── setup_env.sh                  # Install web dependencies
├── start.sh                      # Start web service
└── README.md                     # Project documentation

🚀 Quick Start

We provide two approaches to run and experience the demo service. Considering the differences in the underlying environment, we recommend using Docker as the preferred deployment method.

💻 Start with Dockerfile

This approach relies on Docker, which can be installed by following the official documentation.

# 1. Clone the Youtu-GraphRAG project
git clone https://github.com/TencentCloudADP/youtu-graphrag

# 2. Create .env from .env.example
cd youtu-graphrag && cp .env.example .env
# Configure your LLM API in .env using the OpenAI API format
# LLM_MODEL=deepseek-chat
# LLM_BASE_URL=https://api.deepseek.com
# LLM_API_KEY=sk-xxxxxx

# 3. Build the image from the Dockerfile
docker build -t youtu_graphrag:v1 .

# 4. Run the container
docker run -d -p 8000:8000 youtu_graphrag:v1

# 5. Visit http://localhost:8000
curl -v http://localhost:8000

💻 Web UI Experience

This approach relies on Python 3.10 and the corresponding pip environment, which you can install by following the official documentation.

# 1. Clone the Youtu-GraphRAG project
git clone https://github.com/TencentCloudADP/youtu-graphrag

# 2. Create .env from .env.example
cd youtu-graphrag && cp .env.example .env
# Configure your LLM API in .env using the OpenAI API format
# LLM_MODEL=deepseek-chat
# LLM_BASE_URL=https://api.deepseek.com
# LLM_API_KEY=sk-xxxxxx

# 3. Set up the environment
./setup_env.sh

# 4. Launch the web service
./start.sh

# 5. Visit http://localhost:8000
curl -v http://localhost:8000
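
Both setups read the LLM settings from .env, so a quick check before starting can catch a missing value. The sketch below is a hypothetical helper, not part of the repository; it assumes the three keys shown above are the ones required and uses a deliberately simple parser for `KEY=value` lines.

```python
REQUIRED_KEYS = ("LLM_MODEL", "LLM_BASE_URL", "LLM_API_KEY")

def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]

# Sample content with the API key left out on purpose.
sample = """
# LLM settings
LLM_MODEL=deepseek-chat
LLM_BASE_URL=https://api.deepseek.com
"""

print(missing_keys(parse_env(sample)))  # prints ['LLM_API_KEY']
```

To check a real file, the same functions can be applied to the text returned by `open(".env").read()`.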

📖 Full Usage Guide

For advanced configuration and usage, see the 🚀 FullGuide.

⭐ Start using Youtu-GraphRAG now and experience intelligent question answering! 🚀

🤝 Contributing

We welcome contributions from the community! Here's how you can help:

💻 Code Contribution

  1. 🍴 Fork the project
  2. 🌿 Create a feature branch (git checkout -b feature/AmazingFeature)
  3. 💾 Commit your changes (git commit -m 'Add some AmazingFeature')
  4. 📤 Push to the branch (git push origin feature/AmazingFeature)
  5. 🔄 Create a Pull Request

🔧 Extension Guide

  • 🌱 New Seed Schemas: Add high-quality seed schemas and data processing
  • 📊 Custom Datasets: Integrate new datasets with minimal schema intervention
  • 🎯 Domain-Specific Applications: Extend the framework for specialized use cases with 'Best Practice'

📞 Contact

Hanson Dong - [email protected]
Siyu An - [email protected]


🎉 Citation

@misc{dong2025youtugraphrag,
      title={Youtu-GraphRAG: Vertically Unified Agents for Graph Retrieval-Augmented Complex Reasoning}, 
      author={Junnan Dong and Siyu An and Yifei Yu and Qian-Wen Zhang and Linhao Luo and Xiao Huang and Yunsheng Wu and Di Yin and Xing Sun},
      year={2025},
      eprint={2508.19855},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2508.19855}, 
}
