knowledge-graph-of-thoughts
Official Implementation of "Affordable AI Assistants with Knowledge Graph of Thoughts"
Stars: 139
Knowledge Graph of Thoughts (KGoT) is an innovative AI assistant architecture that integrates LLM reasoning with dynamically constructed knowledge graphs (KGs). KGoT extracts and structures task-relevant knowledge into a dynamic KG representation, iteratively enhanced through external tools such as math solvers, web crawlers, and Python scripts. Such structured representation of task-relevant knowledge enables low-cost models to solve complex tasks effectively. The KGoT system consists of three main components: the Controller, the Graph Store, and the Integrated Tools, each playing a critical role in the task-solving process.
README:
This is the official implementation of Affordable AI Assistants with Knowledge Graph of Thoughts.
Knowledge Graph of Thoughts (KGoT) is an innovative AI assistant architecture that integrates LLM reasoning with dynamically constructed knowledge graphs (KGs). KGoT extracts and structures task-relevant knowledge into a dynamic KG representation, iteratively enhanced through external tools such as math solvers, web crawlers, and Python scripts. Such structured representation of task-relevant knowledge enables low-cost models to solve complex tasks effectively.
The KGoT system is designed as a modular and flexible framework that consists of three main components: the Controller, the Graph Store, and the Integrated Tools, each playing a critical role in the task-solving process.
- The Controller component offers fine-grained control over the customizable parameters in the KGoT pipeline and orchestrates the KG-based reasoning procedure.
- The Graph Store component provides a modular interface for supporting various Knowledge Graph Backends. We initially support Neo4j, NetworkX and RDF4J.
- The Integrated Tools component allows for flexible and extensible Tool Usage and enables the multi-modal reasoning capabilities of the framework.
To use this framework, you need a working installation of Python 3.10 or newer.
Before running the installation, make sure to activate your Python environment, if any.
git clone https://github.com/spcl/knowledge-graph-of-thoughts.git
cd knowledge-graph-of-thoughts/
pip install -e .
playwright install
To get started, make a copy of the following template files inside the kgot directory:
- kgot/config_llms.template.json
- kgot/config_tools.template.json
Then rename them as follows:
- config_llms.template.json → config_llms.json
- config_tools.template.json → config_tools.json
Please update the API keys, if necessary, for the language models you intend to use in the kgot/config_llms.json file.
You can also add new models by incorporating their information into the JSON file.
The object key is the language model identifier used in KGoT, and the various attributes contain the information needed to run the model.
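As a purely illustrative sketch (the attribute names below are assumptions; the authoritative schema is the one in kgot/config_llms.template.json), an entry might look like:
{
  "gpt-4o-mini": {
    "_note": "illustrative sketch only; attribute names are placeholders, see config_llms.template.json",
    "model": "gpt-4o-mini",
    "api_key": "<YOUR_API_KEY>",
    "temperature": 0.0
  }
}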
Local models are expected to be hosted using Ollama. KGoT assumes that the model is accessible at the default Ollama API endpoint (http://localhost:11434) and integrates with it through ChatOllama via the LangChain framework.
[!NOTE] Please be aware that the values for num_ctx, num_predict, and num_batch in the configuration are based on the specific GPU type and VRAM capacity used during our experiments. You may need to adjust these parameters based on your own hardware setup to avoid out-of-memory errors or suboptimal performance.
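For a locally hosted model, a hypothetical entry might look as follows; only num_ctx, num_predict, and num_batch are named in the note above, while the model identifier and any remaining attributes are illustrative assumptions:
{
  "llama3:8b": {
    "_note": "hypothetical sketch; only num_ctx, num_predict and num_batch are documented in the note above",
    "num_ctx": 8192,
    "num_predict": 2048,
    "num_batch": 512
  }
}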
For the SurferAgent tool, we rely on SerpAPI to retrieve the necessary external information from the Internet.
To use this tool, please set the API key within the kgot/config_tools.json file.
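A minimal sketch of what this might look like (the key names below are assumptions; defer to kgot/config_tools.template.json for the actual layout):
{
  "SerpAPI": {
    "_note": "hypothetical layout; check config_tools.template.json for the real field names",
    "api_key": "<YOUR_SERPAPI_KEY>"
  }
}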
In order to provide a secure & consistent execution environment, we containerize critical modules such as the Neo4j graph database, the RDF4J database and the Python Code Tool. This allows the safe execution of LLM-generated code without security concerns.
We provide a Docker and Sarus setup for the KGoT framework. The Docker setup is recommended for local development, while Sarus is intended for HPC environments.
cd containers/
# Docker
chmod -R 777 ../kgot/knowledge_graph/_snapshots # grant permission for snapshots logging
docker compose up
# Sarus
chmod +x sarus_launcher.sh # grant permission for execution
./sarus_launcher.sh
This will build and start the default container images for KGoT, which include:
- Neo4j image
- Python image
cd containers/kgot/
# Docker
docker compose up
# Sarus
chmod +x sarus_launcher.sh # grant permission for execution
./sarus_launcher.sh
This will build and start:
- KGoT image
[!NOTE] Further instructions on RDF4J and on customizing the container images can be found under Container Image Setup.
[!WARNING] The initial building phase of the KGoT container image can take a while (15 minutes), so be patient. If you need to make adjustments, simply stop the instances and restart them with the command docker compose up --build from the containers/kgot directory. Changes to README.md, LICENSE, pyproject.toml, kgot/__init__.py and kgot/__main__.py will cause the Docker instance to be rebuilt from scratch.
We primarily evaluated the Knowledge Graph of Thoughts framework on the GAIA and SimpleQA benchmarks, which we discuss first, before turning to alternative ways to run KGoT.
To avoid sharing the GAIA and SimpleQA datasets in a crawlable format, we do not directly provide the datasets inside the repository. Instead, we offer a download script to help you acquire the datasets.
Please refer to the download guide inside the benchmarks directory for further instructions.
We provide two run scripts for evaluating KGoT on the GAIA and SimpleQA datasets.
chmod +x ./run_multiple_gaia.sh # grant permission for logging etc.
./run_multiple_gaia.sh # perform the actual run with default parameters
chmod +x ./run_multiple_simpleqa.sh # grant permission for logging etc.
./run_multiple_simpleqa.sh # perform the actual run with default parameters
The following instructions apply to the run_multiple_gaia.sh script, but are also applicable to the run_multiple_simpleqa.sh script; for more detailed instructions please refer to the complete guide here.
You can run ./run_multiple_gaia.sh --help to check the supported arguments, which generally match the options found here for the gaia.py Python script.
For optimal results, we recommend enabling the '--gaia_formatter' option, which will format the output in a GAIA-compatible format.
The following are the most commonly used arguments:
Arguments:
--log_base_folder - Directory where logs will be stored [path/to/log_folder]
--controller_choice - Type of solver to use [directRetrieve/queryRetrieve]
--backend_choice - Backend database type [neo4j/networkX/rdf4j]
--tool_choice - Tool configuration [tools_v2_3]
--max_iterations - Max iterations for KGoT [integers > 0]
--gaia_formatter - Use GAIA formatter for the output [True/False]
Example: ./run_multiple_gaia.sh --log_base_folder logs/test_1 --controller_choice directRetrieve --backend_choice networkX --tools "tools_v2_3" --max_iterations 5 --gaia_formatter
We offer three choices for storing the knowledge graph (Neo4j, NetworkX and RDF4J) as well as two choices for the retrieval type (direct and query-based retrieval).
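For example, to pair the Neo4j backend with query-based retrieval using the flags documented above:
./run_multiple_gaia.sh --log_base_folder logs/neo4j_query --controller_choice queryRetrieve --backend_choice neo4j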
We offer two ways to evaluate KGoT on the datasets as well as a way to use KGoT directly with any task description.
As discussed above, you can use the run_multiple_gaia.sh or run_multiple_simpleqa.sh scripts to evaluate KGoT on the GAIA and SimpleQA datasets respectively, which act as frontends for the gaia.py and simpleqa.py Python scripts.
They allow you to evaluate multiple subsets of the datasets or to perform multiple runs on these subsets, while also transferring the knowledge graph snapshots and plotting the results with various metrics.
We further discuss the use of the scripts here.
Please note that if you use your own Neo4j or RDF4J server instead of the one inside the Docker container, the transfer of the knowledge graph snapshots will fail or will need to be adapted.
You can also directly run the Python script gaia.py, which we further discuss here. This Python script will, however, neither plot the resulting data nor move the snapshots of the knowledge graph.
You can also directly use the command kgot, which is fully configurable and can be used to solve a single task:
kgot single -p "What is a knowledge graph?"
You can also, for example, select a desirable backend and pass files via the command line:
kgot --db_choice neo4j --controller_choice directRetrieve single --p "Could you summarize the content of these files for me?" --files [path/to/file1] [path/to/file2]
If you find this repository useful, please consider giving it a star! If you have any questions or feedback, don't hesitate to reach out and open an issue.
When using this in your work, please reference us with the citation provided below:
@misc{besta2025kgot,
title = {{Affordable AI Assistants with Knowledge Graph of Thoughts}},
author = {Besta, Maciej and Paleari, Lorenzo and Jiang, Jia Hao Andrea and Gerstenberger, Robert and Wu, You and Hannesson, Jón Gunnar and Iff, Patrick and Kubicek, Ales and Nyczyk, Piotr and Khimey, Diana and Blach, Nils and Zhang, Haiqiang and Zhang, Tao and Ma, Peiran and Kwaśniewski, Grzegorz and Copik, Marcin and Niewiadomski, Hubert and Hoefler, Torsten},
year = 2025,
month = Jun,
doi = {10.48550/arXiv.2504.02670},
url = {http://arxiv.org/abs/2504.02670},
eprinttype = {arXiv},
eprint = {2504.02670}
}
Alternative AI tools for knowledge-graph-of-thoughts
Similar Open Source Tools
warc-gpt
WARC-GPT is an experimental retrieval augmented generation pipeline for web archive collections. It allows users to interact with WARC files, extract text, generate text embeddings, visualize embeddings, and interact with a web UI and API. The tool is highly customizable, supporting various LLMs, providers, and embedding models. Users can configure the application using environment variables, ingest WARC files, start the server, and interact with the web UI and API to search for content and generate text completions. WARC-GPT is designed for exploration and experimentation in exploring web archives using AI.
mosec
Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API.
- **Highly performant**: web layer and task coordination built with Rust 🦀, which offers blazing speed in addition to efficient CPU utilization powered by async I/O
- **Ease of use**: user interface purely in Python 🐍, by which users can serve their models in an ML framework-agnostic manner using the same code as they do for offline testing
- **Dynamic batching**: aggregate requests from different users for batched inference and distribute results back
- **Pipelined stages**: spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads
- **Cloud friendly**: designed to run in the cloud, with model warmup, graceful shutdown, and Prometheus monitoring metrics, easily managed by Kubernetes or any container orchestration systems
- **Do one thing well**: focus on the online serving part, so users can pay attention to model optimization and business logic
aici
The Artificial Intelligence Controller Interface (AICI) lets you build Controllers that constrain and direct output of a Large Language Model (LLM) in real time. Controllers are flexible programs capable of implementing constrained decoding, dynamic editing of prompts and generated text, and coordinating execution across multiple, parallel generations. Controllers incorporate custom logic during the token-by-token decoding and maintain state during an LLM request. This allows diverse Controller strategies, from programmatic or query-based decoding to multi-agent conversations to execute efficiently in tight integration with the LLM itself.
pgai
pgai simplifies the process of building search and Retrieval Augmented Generation (RAG) AI applications with PostgreSQL. It brings embedding and generation AI models closer to the database, allowing users to create embeddings, retrieve LLM chat completions, reason over data for classification, summarization, and data enrichment directly from within PostgreSQL in a SQL query. The tool requires an OpenAI API key and a PostgreSQL client to enable AI functionality in the database. Users can install pgai from source, run it in a pre-built Docker container, or enable it in a Timescale Cloud service. The tool provides functions to handle API keys using psql or Python, and offers various AI functionalities like tokenizing, detokenizing, embedding, chat completion, and content moderation.
ontogpt
OntoGPT is a Python package for extracting structured information from text using large language models, instruction prompts, and ontology-based grounding. It provides a command line interface and a minimal web app for easy usage. The tool has been evaluated on test data and is used in related projects like TALISMAN for gene set analysis. OntoGPT enables users to extract information from text by specifying relevant terms and provides the extracted objects as output.
cognita
Cognita is an open-source framework to organize your RAG codebase along with a frontend to play around with different RAG customizations. It provides a simple way to organize your codebase so that it becomes easy to test it locally while also being able to deploy it in a production-ready environment. The key issues that arise while productionizing a RAG system from a Jupyter Notebook are:
1. **Chunking and Embedding Job**: The chunking and embedding code usually needs to be abstracted out and deployed as a job. Sometimes the job will need to run on a schedule or be triggered via an event to keep the data updated.
2. **Query Service**: The code that generates the answer from the query needs to be wrapped up in an API server like FastAPI and should be deployed as a service. This service should be able to handle multiple queries at the same time and also autoscale with higher traffic.
3. **LLM / Embedding Model Deployment**: Often, if we are using open-source models, we load the model in the Jupyter notebook. This will need to be hosted as a separate service in production and the model will need to be called as an API.
4. **Vector DB deployment**: Most testing happens on vector DBs in memory or on disk. However, in production, the DBs need to be deployed in a more scalable and reliable way.
Cognita makes it really easy to customize and experiment with everything about a RAG system and still be able to deploy it in a good way. It also ships with a UI that makes it easier to try out different RAG configurations and see the results in real time. You can use it locally, with or without any Truefoundry components. However, using Truefoundry components makes it easier to test different models and deploy the system in a scalable way. Cognita allows you to host multiple RAG systems using one app. Advantages of using Cognita:
1. A central reusable repository of parsers, loaders, embedders and retrievers.
2. Ability for non-technical users to play with the UI - upload documents and perform QnA using modules built by the development team.
3. Fully API driven - which allows integration with other systems.
If you use Cognita with Truefoundry AI Gateway, you can get logging, metrics and a feedback mechanism for your user queries. Features:
1. Support for multiple document retrievers that use `Similarity Search`, `Query Decomposition`, `Document Reranking`, etc.
2. Support for SOTA open-source embeddings and reranking from `mixedbread-ai`
3. Support for using LLMs via `Ollama`
4. Support for incremental indexing that ingests entire documents in batches (reduces compute burden), keeps track of already indexed documents and prevents re-indexing of those docs.
MegatronApp
MegatronApp is a toolchain built around the Megatron-LM training framework, offering performance tuning, slow-node detection, and training-process visualization. It includes modules like MegaScan for anomaly detection, MegaFBD for forward-backward decoupling, MegaDPP for dynamic pipeline planning, and MegaScope for visualization. The tool aims to enhance large-scale distributed training by providing valuable capabilities and insights.
OlympicArena
OlympicArena is a comprehensive benchmark designed to evaluate advanced AI capabilities across various disciplines. It aims to push AI towards superintelligence by tackling complex challenges in science and beyond. The repository provides detailed data for different disciplines, allows users to run inference and evaluation locally, and offers a submission platform for testing models on the test set. Additionally, it includes an annotation interface and encourages users to cite their paper if they find the code or dataset helpful.
neuron-ai
Neuron is a PHP framework for creating and orchestrating AI Agents, providing tools for the entire agentic application development lifecycle. It allows integration of AI entities in existing PHP applications with a powerful and flexible architecture. Neuron offers tutorials and educational content to help users get started using AI Agents in their projects. The framework supports various LLM providers, tools, and toolkits, enabling users to create fully functional agents for tasks like data analysis, chatbots, and structured output. Neuron also facilitates monitoring and debugging of AI applications, ensuring control over agent behavior and decision-making processes.
LLMeBench
LLMeBench is a flexible framework designed for accelerating benchmarking of Large Language Models (LLMs) in the field of Natural Language Processing (NLP). It supports evaluation of various NLP tasks using model providers like OpenAI, HuggingFace Inference API, and Petals. The framework is customizable for different NLP tasks, LLM models, and datasets across multiple languages. It features extensive caching capabilities, supports zero- and few-shot learning paradigms, and allows on-the-fly dataset download and caching. LLMeBench is open-source and continuously expanding to support new models accessible through APIs.
lmql
LMQL is a programming language designed for large language models (LLMs) that offers a unique way of integrating traditional programming with LLM interaction. It allows users to write programs that combine algorithmic logic with LLM calls, enabling model reasoning capabilities within the context of the program. LMQL provides features such as Python syntax integration, rich control-flow options, advanced decoding techniques, powerful constraints via logit masking, runtime optimization, sync and async API support, multi-model compatibility, and extensive applications like JSON decoding and interactive chat interfaces. The tool also offers library integration, flexible tooling, and output streaming options for easy model output handling.
safety-tooling
This repository, safety-tooling, is designed to be shared across various AI Safety projects. It provides an LLM API with a common interface for OpenAI, Anthropic, and Google models. The aim is to facilitate collaboration among AI Safety researchers, especially those with limited software engineering backgrounds, by offering a platform for contributing to a larger codebase. The repo can be used as a git submodule for easy collaboration and updates. It also supports pip installation for convenience. The repository includes features for installation, secrets management, linting, formatting, Redis configuration, testing, dependency management, inference, finetuning, API usage tracking, and various utilities for data processing and experimentation.
BTGenBot
BTGenBot is a tool that generates behavior trees for robots using lightweight large language models (LLMs) with a maximum of 7 billion parameters. It fine-tunes on a specific dataset, compares multiple LLMs, and evaluates generated behavior trees using various methods. The tool demonstrates the potential of LLMs with a limited number of parameters in creating effective and efficient robot behaviors.
AI-Scientist
The AI Scientist is a comprehensive system for fully automatic scientific discovery, enabling Foundation Models to perform research independently. It aims to tackle the grand challenge of developing agents capable of conducting scientific research and discovering new knowledge. The tool generates papers on various topics using Large Language Models (LLMs) and provides a platform for exploring new research ideas. Users can create their own templates for specific areas of study and run experiments to generate papers. However, caution is advised as the codebase executes LLM-written code, which may pose risks such as the use of potentially dangerous packages and web access.
eval-dev-quality
DevQualityEval is an evaluation benchmark and framework designed to compare and improve the quality of code generation of Large Language Models (LLMs). It provides developers with a standardized benchmark to enhance real-world usage in software development and offers users metrics and comparisons to assess the usefulness of LLMs for their tasks. The tool evaluates LLMs' performance in solving software development tasks and measures the quality of their results through a point-based system. Users can run specific tasks, such as test generation, across different programming languages to evaluate LLMs' language understanding and code generation capabilities.
For similar tasks
GeminiChatUp
Gemini ChatUp is a chat application utilizing the Google GeminiPro API Key. It supports responsive layout and can store multiple sets of conversations with customizable parameters for each set. Users can log in with a test account or provide their own API Key to deploy the feature. The application also offers user authentication through Edge config in Vercel, allowing users to add usernames and passwords in JSON format. Local deployment is possible by installing dependencies, setting up environment variables, and running the application locally.
ai-tech-interview
This repository contains a collection of interview questions related to various topics such as statistics, machine learning, deep learning, Python, networking, operating systems, data structures, and algorithms. The questions cover a wide range of concepts and are suitable for individuals preparing for technical interviews in the field of artificial intelligence and data science.
free-ai-tips
Free AI Tips is a GitHub repository that provides weekly tips on Generative AI and Machine Learning. Users can register to receive these tips for free. The repository aims to offer valuable insights and knowledge in the field of AI and ML to help individuals enhance their skills and stay updated with the latest trends and developments.
ragflow
RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that combines deep document understanding with Large Language Models (LLMs) to provide accurate question-answering capabilities. It offers a streamlined RAG workflow for businesses of all sizes, enabling them to extract knowledge from unstructured data in various formats, including Word documents, slides, Excel files, images, and more. RAGFlow's key features include deep document understanding, template-based chunking, grounded citations with reduced hallucinations, compatibility with heterogeneous data sources, and an automated and effortless RAG workflow. It supports multiple recall paired with fused re-ranking, configurable LLMs and embedding models, and intuitive APIs for seamless integration with business applications.
Agent
Agent is a RustSBI specialized domain knowledge quiz LLM tool that extracts domain knowledge from various sources such as Rust Documentation, RISC-V Documentation, Bouffalo Docs, Bouffalo SDK, and Xiangshan Docs. It also provides resources for LLM prompt engineering and RAG engineering, including guides and existing projects related to retrieval-augmented generation (RAG) systems.
vault-ai
OP Vault is a tool that leverages the OP Stack (OpenAI + Pinecone Vector Database) to allow users to upload custom knowledgebase files and ask questions about their contents. It provides a user-friendly Golang server and React frontend for querying human-readable content like books and documents, making it valuable for knowledge extraction and question-answering. Users can upload entire libraries, receive specific answers with file and section references, and explore the power of the OP Stack in a practical interface.
MemoryBear
MemoryBear is a next-generation AI memory system developed by RedBear AI, focusing on overcoming limitations in knowledge storage and multi-agent collaboration. It empowers AI with human-like memory capabilities, enabling deep knowledge understanding and cognitive collaboration. The system addresses challenges such as knowledge forgetting, memory gaps in multi-agent collaboration, and semantic ambiguity during reasoning. MemoryBear's core features include memory extraction engine, graph storage, hybrid search, memory forgetting engine, self-reflection engine, and FastAPI services. It offers a standardized service architecture for efficient integration and invocation across applications.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.
