nano-graphrag
A simple, easy-to-hack GraphRAG implementation
Stars: 995
nano-GraphRAG is a simple, easy-to-hack implementation of GraphRAG that provides a smaller, faster, and cleaner version of the official implementation. It is about 800 lines of code, small yet scalable, asynchronous, and fully typed. The tool supports incremental insert, async methods, and various parameters for customization. Users can replace storage components and LLM functions as needed. It also allows for embedding function replacement and comes with pre-defined prompts for entity extraction and community reports. However, some features like covariates and global search implementation differ from the original GraphRAG. Future versions aim to address issues related to data source ID, community description truncation, and add new components.
README:
😭 GraphRAG is good and powerful, but the official implementation is difficult/painful to read or hack.
😊 This project provides a smaller, faster, cleaner GraphRAG, while remaining the core functionality(see benchmark and issues ).
🎁 Excluding tests
and prompts, nano-graphrag
is about 1100 lines of code.
👌 Small yet portable(faiss, neo4j, ollama...), asynchronous and fully typed.
Install from source (recommend)
# clone this repo first
cd nano-graphrag
pip install -e .
Install from PyPi
pip install nano-graphrag
[!TIP]
Please set OpenAI API key in environment:
export OPENAI_API_KEY="sk-..."
.
[!TIP] If you're using Azure OpenAI API, refer to the .env.example to set your azure openai. Then pass
GraphRAG(...,using_azure_openai=True,...)
to enable.
[!TIP]
If you don't have any key, check out this example that using
transformers
andollama
. If you like to use another LLM or Embedding Model, check Advances.
download a copy of A Christmas Carol by Charles Dickens:
curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt
Use the below python snippet:
from nano_graphrag import GraphRAG, QueryParam
graph_func = GraphRAG(working_dir="./dickens")
with open("./book.txt") as f:
graph_func.insert(f.read())
# Perform global graphrag search
print(graph_func.query("What are the top themes in this story?"))
# Perform local graphrag search (I think is better and more scalable one)
print(graph_func.query("What are the top themes in this story?", param=QueryParam(mode="local")))
Next time you initialize a GraphRAG
from the same working_dir
, it will reload all the contexts automatically.
graph_func.insert(["TEXT1", "TEXT2",...])
Incremental Insert
nano-graphrag
supports incremental insert, no duplicated computation or data will be added:
with open("./book.txt") as f:
book = f.read()
half_len = len(book) // 2
graph_func.insert(book[:half_len])
graph_func.insert(book[half_len:])
nano-graphrag
use md5-hash of the content as the key, so there is no duplicated chunk.However, each time you insert, the communities of graph will be re-computed and the community reports will be re-generated
Naive RAG
nano-graphrag
supports naive RAG insert and query as well:
graph_func = GraphRAG(working_dir="./dickens", enable_naive_rag=True)
...
# Query
print(rag.query(
"What are the top themes in this story?",
param=QueryParam(mode="naive")
)
For each method NAME(...)
, there is a corresponding async method aNAME(...)
await graph_func.ainsert(...)
await graph_func.aquery(...)
...
GraphRAG
and QueryParam
are dataclass
in Python. Use help(GraphRAG)
and help(QueryParam)
to see all available parameters! Or check out the Advances section to see some options.
Below are the components you can use:
Type | What | Where |
---|---|---|
LLM | OpenAI | Built-in |
DeepSeek | examples | |
ollama |
examples | |
Embedding | OpenAI | Built-in |
Sentence-transformers | examples | |
Vector DataBase | nano-vectordb |
Built-in |
hnswlib |
Built-in, examples | |
milvus-lite |
examples | |
faiss | examples | |
Graph Storage | networkx |
Built-in |
neo4j |
Built-in(doc) | |
Visualization | graphml | examples |
Chunking | by token size | Built-in |
by text splitter | Built-in |
-
Built-in
means we have that implementation insidenano-graphrag
.examples
means we have that implementation inside an tutorial under examples folder. -
Check examples/benchmarks to see few comparisons between components.
-
Always welcome to contribute more components.
Some setup options
-
GraphRAG(...,always_create_working_dir=False,...)
will skip the dir-creating step. Use it if you switch all your components to non-file storages.
Only query the related context
graph_func.query
return the final answer without streaming.
If you like to interagte nano-graphrag
in your project, you can use param=QueryParam(..., only_need_context=True,...)
, which will only return the retrieved context from graph, something like:
# Local mode
-----Reports-----
```csv
id, content
0, # FOX News and Key Figures in Media and Politics...
1, ...
```
...
# Global mode
----Analyst 3----
Importance Score: 100
Donald J. Trump: Frequently discussed in relation to his political activities...
...
You can integrate that context into your customized prompt.
Prompt
nano-graphrag
use prompts from nano_graphrag.prompt.PROMPTS
dict object. You can play with it and replace any prompt inside.
Some important prompts:
-
PROMPTS["entity_extraction"]
is used to extract the entities and relations from a text chunk. -
PROMPTS["community_report"]
is used to organize and summary the graph cluster's description. -
PROMPTS["local_rag_response"]
is the system prompt template of the local search generation. -
PROMPTS["global_reduce_rag_response"]
is the system prompt template of the global search generation. -
PROMPTS["fail_response"]
is the fallback response when nothing is related to the user query.
Customize Chunking
nano-graphrag
allow you to customize your own chunking method, check out the example.
Switch to the built-in text splitter chunking method:
from nano_graphrag._op import chunking_by_seperators
GraphRAG(...,chunk_func=chunking_by_seperators,...)
LLM Function
In nano-graphrag
, we requires two types of LLM, a great one and a cheap one. The former is used to plan and respond, the latter is used to summary. By default, the great one is gpt-4o
and the cheap one is gpt-4o-mini
You can implement your own LLM function (refer to _llm.gpt_4o_complete
):
async def my_llm_complete(
prompt, system_prompt=None, history_messages=[], **kwargs
) -> str:
# pop cache KV database if any
hashing_kv: BaseKVStorage = kwargs.pop("hashing_kv", None)
# the rest kwargs are for calling LLM, for example, `max_tokens=xxx`
...
# YOUR LLM calling
response = await call_your_LLM(messages, **kwargs)
return response
Replace the default one with:
# Adjust the max token size or the max async requests if needed
GraphRAG(best_model_func=my_llm_complete, best_model_max_token_size=..., best_model_max_async=...)
GraphRAG(cheap_model_func=my_llm_complete, cheap_model_max_token_size=..., cheap_model_max_async=...)
You can refer to this example that use deepseek-chat
as the LLM model
You can refer to this example that use ollama
as the LLM model
nano-graphrag
will use best_model_func
to output JSON with params "response_format": {"type": "json_object"}
. However there are some open-source model maybe produce unstable JSON.
nano-graphrag
introduces a post-process interface for you to convert the response to JSON. This func's signature is below:
def YOUR_STRING_TO_JSON_FUNC(response: str) -> dict:
"Convert the string response to JSON"
...
And pass your own func by GraphRAG(...convert_response_to_json_func=YOUR_STRING_TO_JSON_FUNC,...)
.
For example, you can refer to json_repair to repair the JSON string returned by LLM.
Embedding Function
You can replace the default embedding functions with any _utils.EmbedddingFunc
instance.
For example, the default one is using OpenAI embedding API:
@wrap_embedding_func_with_attrs(embedding_dim=1536, max_token_size=8192)
async def openai_embedding(texts: list[str]) -> np.ndarray:
openai_async_client = AsyncOpenAI()
response = await openai_async_client.embeddings.create(
model="text-embedding-3-small", input=texts, encoding_format="float"
)
return np.array([dp.embedding for dp in response.data])
Replace default embedding function with:
GraphRAG(embedding_func=your_embed_func, embedding_batch_num=..., embedding_func_max_async=...)
You can refer to an example that use sentence-transformer
to locally compute embeddings.
Storage Component
You can replace all storage-related components to your own implementation, nano-graphrag
mainly uses three kinds of storage:
base.BaseKVStorage
for storing key-json pairs of data
- By default we use disk file storage as the backend.
GraphRAG(.., key_string_value_json_storage_cls=YOURS,...)
base.BaseVectorStorage
for indexing embeddings
- By default we use
nano-vectordb
as the backend. - We have a built-in
hnswlib
storage also, check out this example. - Check out this example that implements
milvus-lite
as the backend (not available in Windows). GraphRAG(.., vector_db_storage_cls=YOURS,...)
base.BaseGraphStorage
for storing knowledge graph
- By default we use
networkx
as the backend. - We have a built-in
Neo4jStorage
for graph, check out this tutorial. GraphRAG(.., graph_storage_cls=YOURS,...)
You can refer to nano_graphrag.base
to see detailed interfaces for each components.
Check FQA.
See ROADMAP.md
nano-graphrag
is open to any kind of contribution. Read this before you contribute.
- Medical Graph RAG: Graph RAG for the Medical Data
- LightRAG: Simple and Fast Retrieval-Augmented Generation
Welcome to pull requests if your project uses
nano-graphrag
, it will help others to trust this repo❤️
-
nano-graphrag
didn't implement thecovariates
feature ofGraphRAG
-
nano-graphrag
implements the global search different from the original. The original use a map-reduce-like style to fill all the communities into context, whilenano-graphrag
only use the top-K important and central communites (useQueryParam.global_max_consider_community
to control, default to 512 communities).
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for nano-graphrag
Similar Open Source Tools
nano-graphrag
nano-GraphRAG is a simple, easy-to-hack implementation of GraphRAG that provides a smaller, faster, and cleaner version of the official implementation. It is about 800 lines of code, small yet scalable, asynchronous, and fully typed. The tool supports incremental insert, async methods, and various parameters for customization. Users can replace storage components and LLM functions as needed. It also allows for embedding function replacement and comes with pre-defined prompts for entity extraction and community reports. However, some features like covariates and global search implementation differ from the original GraphRAG. Future versions aim to address issues related to data source ID, community description truncation, and add new components.
LeanCopilot
Lean Copilot is a tool that enables the use of large language models (LLMs) in Lean for proof automation. It provides features such as suggesting tactics/premises, searching for proofs, and running inference of LLMs. Users can utilize built-in models from LeanDojo or bring their own models to run locally or on the cloud. The tool supports platforms like Linux, macOS, and Windows WSL, with optional CUDA and cuDNN for GPU acceleration. Advanced users can customize behavior using Tactic APIs and Model APIs. Lean Copilot also allows users to bring their own models through ExternalGenerator or ExternalEncoder. The tool comes with caveats such as occasional crashes and issues with premise selection and proof search. Users can get in touch through GitHub Discussions for questions, bug reports, feature requests, and suggestions. The tool is designed to enhance theorem proving in Lean using LLMs.
python-tgpt
Python-tgpt is a Python package that enables seamless interaction with over 45 free LLM providers without requiring an API key. It also provides image generation capabilities. The name _python-tgpt_ draws inspiration from its parent project tgpt, which operates on Golang. Through this Python adaptation, users can effortlessly engage with a number of free LLMs available, fostering a smoother AI interaction experience.
datadreamer
DataDreamer is an advanced toolkit designed to facilitate the development of edge AI models by enabling synthetic data generation, knowledge extraction from pre-trained models, and creation of efficient and potent models. It eliminates the need for extensive datasets by generating synthetic datasets, leverages latent knowledge from pre-trained models, and focuses on creating compact models suitable for integration into any device and performance for specialized tasks. The toolkit offers features like prompt generation, image generation, dataset annotation, and tools for training small-scale neural networks for edge deployment. It provides hardware requirements, usage instructions, available models, and limitations to consider while using the library.
receipt-scanner
The receipt-scanner repository is an AI-Powered Receipt and Invoice Scanner for Laravel that allows users to easily extract structured receipt data from images, PDFs, and emails within their Laravel application using OpenAI. It provides a light wrapper around OpenAI Chat and Completion endpoints, supports various input formats, and integrates with Textract for OCR functionality. Users can install the package via composer, publish configuration files, and use it to extract data from plain text, PDFs, images, Word documents, and web content. The scanned receipt data is parsed into a DTO structure with main classes like Receipt, Merchant, and LineItem.
k8sgpt
K8sGPT is a tool for scanning your Kubernetes clusters, diagnosing, and triaging issues in simple English. It has SRE experience codified into its analyzers and helps to pull out the most relevant information to enrich it with AI.
ChatDBG
ChatDBG is an AI-based debugging assistant for C/C++/Python/Rust code that integrates large language models into a standard debugger (`pdb`, `lldb`, `gdb`, and `windbg`) to help debug your code. With ChatDBG, you can engage in a dialog with your debugger, asking open-ended questions about your program, like `why is x null?`. ChatDBG will _take the wheel_ and steer the debugger to answer your queries. ChatDBG can provide error diagnoses and suggest fixes. As far as we are aware, ChatDBG is the _first_ debugger to automatically perform root cause analysis and to provide suggested fixes.
evalplus
EvalPlus is a rigorous evaluation framework for LLM4Code, providing HumanEval+ and MBPP+ tests to evaluate large language models on code generation tasks. It offers precise evaluation and ranking, coding rigorousness analysis, and pre-generated code samples. Users can use EvalPlus to generate code solutions, post-process code, and evaluate code quality. The tool includes tools for code generation and test input generation using various backends.
mistral-inference
Mistral Inference repository contains minimal code to run 7B, 8x7B, and 8x22B models. It provides model download links, installation instructions, and usage guidelines for running models via CLI or Python. The repository also includes information on guardrailing, model platforms, deployment, and references. Users can interact with models through commands like mistral-demo, mistral-chat, and mistral-common. Mistral AI models support function calling and chat interactions for tasks like testing models, chatting with models, and using Codestral as a coding assistant. The repository offers detailed documentation and links to blogs for further information.
chatgpt-subtitle-translator
This tool utilizes the OpenAI ChatGPT API to translate text, with a focus on line-based translation, particularly for SRT subtitles. It optimizes token usage by removing SRT overhead and grouping text into batches, allowing for arbitrary length translations without excessive token consumption while maintaining a one-to-one match between line input and output.
ChatSim
ChatSim is a tool designed for editable scene simulation for autonomous driving via LLM-Agent collaboration. It provides functionalities for setting up the environment, installing necessary dependencies like McNeRF and Inpainting tools, and preparing data for simulation. Users can train models, simulate scenes, and track trajectories for smoother and more realistic results. The tool integrates with Blender software and offers options for training McNeRF models and McLight's skydome estimation network. It also includes a trajectory tracking module for improved trajectory tracking. ChatSim aims to facilitate the simulation of autonomous driving scenarios with collaborative LLM-Agents.
aicsimageio
AICSImageIO is a Python tool for Image Reading, Metadata Conversion, and Image Writing for Microscopy Images. It supports various file formats like OME-TIFF, TIFF, ND2, DV, CZI, LIF, PNG, GIF, and Bio-Formats. Users can read and write metadata and imaging data, work with different file systems like local paths, HTTP URLs, s3fs, and gcsfs. The tool provides functionalities for full image reading, delayed image reading, mosaic image reading, metadata reading, xarray coordinate plane attachment, cloud IO support, and saving to OME-TIFF. It also offers benchmarking and developer resources.
olah
Olah is a self-hosted lightweight Huggingface mirror service that implements mirroring feature for Huggingface resources at file block level, enhancing download speeds and saving bandwidth. It offers cache control policies and allows administrators to configure accessible repositories. Users can install Olah with pip or from source, set up the mirror site, and download models and datasets using huggingface-cli. Olah provides additional configurations through a configuration file for basic setup and accessibility restrictions. Future work includes implementing an administrator and user system, OOS backend support, and mirror update schedule task. Olah is released under the MIT License.
MindSearch
MindSearch is an open-source AI Search Engine Framework that mimics human minds to provide deep AI search capabilities. It allows users to deploy their own search engine using either close-source or open-source language models. MindSearch offers features such as answering any question using web knowledge, in-depth knowledge discovery, detailed solution paths, optimized UI experience, and dynamic graph construction process.
aim
Aim is a command-line tool for downloading and uploading files with resume support. It supports various protocols including HTTP, FTP, SFTP, SSH, and S3. Aim features an interactive mode for easy navigation and selection of files, as well as the ability to share folders over HTTP for easy access from other devices. Additionally, it offers customizable progress indicators and output formats, and can be integrated with other commands through piping. Aim can be installed via pre-built binaries or by compiling from source, and is also available as a Docker image for platform-independent usage.
chatllm.cpp
ChatLLM.cpp is a pure C++ implementation tool for real-time chatting with RAG on your computer. It supports inference of various models ranging from less than 1B to more than 300B. The tool provides accelerated memory-efficient CPU inference with quantization, optimized KV cache, and parallel computing. It allows streaming generation with a typewriter effect and continuous chatting with virtually unlimited content length. ChatLLM.cpp also offers features like Retrieval Augmented Generation (RAG), LoRA, Python/JavaScript/C bindings, web demo, and more possibilities. Users can clone the repository, quantize models, build the project using make or CMake, and run quantized models for interactive chatting.
For similar tasks
phospho
Phospho is a text analytics platform for LLM apps. It helps you detect issues and extract insights from text messages of your users or your app. You can gather user feedback, measure success, and iterate on your app to create the best conversational experience for your users.
OpenFactVerification
Loki is an open-source tool designed to automate the process of verifying the factuality of information. It provides a comprehensive pipeline for dissecting long texts into individual claims, assessing their worthiness for verification, generating queries for evidence search, crawling for evidence, and ultimately verifying the claims. This tool is especially useful for journalists, researchers, and anyone interested in the factuality of information.
open-parse
Open Parse is a Python library for visually discerning document layouts and chunking them effectively. It is designed to fill the gap in open-source libraries for handling complex documents. Unlike text splitting, which converts a file to raw text and slices it up, Open Parse visually analyzes documents for superior LLM input. It also supports basic markdown for parsing headings, bold, and italics, and has high-precision table support, extracting tables into clean Markdown formats with accuracy that surpasses traditional tools. Open Parse is extensible, allowing users to easily implement their own post-processing steps. It is also intuitive, with great editor support and completion everywhere, making it easy to use and learn.
spaCy
spaCy is an industrial-strength Natural Language Processing (NLP) library in Python and Cython. It incorporates the latest research and is designed for real-world applications. The library offers pretrained pipelines supporting 70+ languages, with advanced neural network models for tasks such as tagging, parsing, named entity recognition, and text classification. It also facilitates multi-task learning with pretrained transformers like BERT, along with a production-ready training system and streamlined model packaging, deployment, and workflow management. spaCy is commercial open-source software released under the MIT license.
NanoLLM
NanoLLM is a tool designed for optimized local inference for Large Language Models (LLMs) using HuggingFace-like APIs. It supports quantization, vision/language models, multimodal agents, speech, vector DB, and RAG. The tool aims to provide efficient and effective processing for LLMs on local devices, enhancing performance and usability for various AI applications.
ontogpt
OntoGPT is a Python package for extracting structured information from text using large language models, instruction prompts, and ontology-based grounding. It provides a command line interface and a minimal web app for easy usage. The tool has been evaluated on test data and is used in related projects like TALISMAN for gene set analysis. OntoGPT enables users to extract information from text by specifying relevant terms and provides the extracted objects as output.
lima
LIMA is a multilingual linguistic analyzer developed by the CEA LIST, LASTI laboratory. It is Free Software available under the MIT license. LIMA has state-of-the-art performance for more than 60 languages using deep learning modules. It also includes a powerful rules-based mechanism called ModEx for extracting information in new domains without annotated data.
liboai
liboai is a simple C++17 library for the OpenAI API, providing developers with access to OpenAI endpoints through a collection of methods and classes. It serves as a spiritual port of OpenAI's Python library, 'openai', with similar structure and features. The library supports various functionalities such as ChatGPT, Audio, Azure, Functions, Image DALL·E, Models, Completions, Edit, Embeddings, Files, Fine-tunes, Moderation, and Asynchronous Support. Users can easily integrate the library into their C++ projects to interact with OpenAI services.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.