
LLM-KG4QA
LLM-KG4QA: Large Language Models and Knowledge Graphs for Question Answering
Stars: 80

LLM-KG4QA is a repository focused on the integration of Large Language Models (LLMs) and Knowledge Graphs (KGs) for Question Answering (QA). It covers the use of KGs as background knowledge, as reasoning guidelines, and as refiners/filters. The repository collects work on pre-training, fine-tuning, and Retrieval Augmented Generation (RAG) techniques for enhancing QA performance. It also covers complex QA tasks such as Explainable QA, Multi-Modal QA, Multi-Document QA, Multi-Hop QA, Multi-run and Conversational QA, Temporal QA, and Multi-domain and Multilingual QA, along with advanced topics such as Optimization and Data Management. Additionally, it includes benchmark datasets, industrial and scientific applications, demos, and related surveys in the field.
README:
- 2025-08: Our survey has been accepted to the EMNLP 2025 main conference.
- 2025-05: The preprint of our extended survey is available on arXiv.
- 2025-02: Our tutorial was accepted to be presented at EDBT 2025.
- 2024-12: We created this repository to maintain a paper list on LLMs and KGs for QA.
If you find our work useful, please cite our paper using the following BibTeX entry.
@article{ma2025llmkg4qa,
title={Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities},
author={Ma, Chuangtao and Chen, Yongrui and Wu, Tianxing and Khan, Arijit and Wang, Haofen},
journal={arXiv preprint arXiv:2505.20099},
year={2025}
}
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Deep Bidirectional Language-Knowledge Graph pretraining | NeurIPS | 2022 | Pre-training | Link |
2 | GreaseLM: Graph REASoning Enhanced Language Models | ICLR | 2022 | Pre-training | Link |
3 | InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration | LLM+KG@VLDB | 2024 | Pre-training | Link |
4 | Large Language Models Meet Knowledge Graphs to Answer Factoid Questions | PACLIC | 2023 | Pre-training | Link |
5 | KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning | arXiv | 2024 | Pre-training | Link |
6 | KBLaM: Knowledge Base augmented Language Model | ICLR | 2025 | Pre-training | Link |
7 | KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation | NAACL | 2024 | Fine-Tuning | Link |
8 | KG-Adapter: Enabling Knowledge Graph Integration in Large Language Models through Parameter-Efficient Fine-Tuning | ACL Findings | 2024 | Fine-Tuning | Link |
9 | A GAIL Fine-Tuned LLM Enhanced Framework for Low-Resource Knowledge Graph Question Answering | CIKM | 2024 | Fine-Tuning | Link |
10 | Knowledge Graph Finetuning Enhances Knowledge Manipulation in Large Language Models | ICLR | 2025 | Fine-Tuning | Link |
11 | Learn Together: Joint Multitask Finetuning of Pretrained KG-enhanced LLM for Downstream Tasks | GenAIK@COLING | 2025 | Fine-Tuning | Link |
12 | Improving Pre-trained Language Models with Knowledge Enhancement and Filtering Framework | NAACL Findings | 2025 | Fine-Tuning | Link |
13 | Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts | arXiv | 2024 | KG-Augmented Prompting | Link |
14 | KnowGPT: Knowledge Graph based Prompting for Large Language Models | arXiv | 2024 | KG-Augmented Prompting | Link |
15 | Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering | NLRSE | 2023 | KG-Augmented Prompting | Link |
16 | Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering | IJCKG | 2023 | KG-Augmented Prompting | Link |
17 | Mitigating LLM Hallucinations with Knowledge Graphs: A Case Study | arXiv | 2025 | KG-Augmented Prompting | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Enhancing Textbook Question Answering Task with Large Language Models and Retrieval Augmented Generation | arXiv | 2024 | RAG | Link |
2 | Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering | CIKM | 2024 | RAG | Link |
3 | Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation | arXiv | 2024 | RAG | Link |
4 | RAG-based Question Answering over Heterogeneous Data and Text | arXiv | 2024 | RAG | Link |
5 | Awakening Augmented Generation: Learning to Awaken Internal Knowledge of Large Language Models for Question Answering | COLING | 2025 | RAG | Link |
6 | SAGE: A Framework of Precise Retrieval for RAG | arXiv | 2025 | RAG | Link |
7 | From Local to Global: A Graph RAG Approach to Query-Focused Summarization | arXiv | 2024 | Graph RAG | Link |
8 | LightRAG: Simple and Fast Retrieval-Augmented Generation | arXiv | 2024 | Graph RAG | Link |
9 | GRAG: Graph Retrieval-Augmented Generation | arXiv | 2024 | Graph RAG | Link |
10 | HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases | arXiv | 2024 | Graph RAG | Link |
11 | CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs | arXiv | 2025 | Graph RAG | Link |
12 | MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation | arXiv | 2025 | Graph RAG | Link |
13 | GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation | arXiv | 2025 | Graph RAG | Link |
14 | MSG-LLM: A Multi-scale Interactive Framework for Graph-enhanced Large Language Models | COLING | 2025 | Graph RAG | Link |
15 | PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths | arXiv | 2025 | Graph RAG | Link |
16 | In-depth Analysis of Graph-based RAG in a Unified Framework | arXiv | 2025 | Graph RAG | Link |
17 | Empowering GraphRAG with Knowledge Filtering and Integration | arXiv | 2025 | Graph RAG | Link |
18 | Graph Neural Network Enhanced Retrieval for Question Answering of Large Language Models | NAACL | 2025 | Graph RAG | Link |
19 | NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes | arXiv | 2025 | Graph RAG | Link |
20 | KG-RAG: Bridging the Gap Between Knowledge and Creativity | arXiv | 2024 | KG RAG | Link |
21 | Knowledge Graph-extended Retrieval Augmented Generation for Question Answering | arXiv | 2025 | KG RAG | Link |
22 | Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering | SIGIR | 2024 | KG RAG | Link |
23 | Enhancing Large Language Models with Knowledge Graphs for Robust Question Answering | ICPADS | 2024 | KG RAG | Link |
24 | FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs | arXiv | 2025 | KG RAG | Link |
25 | SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation | arXiv | 2025 | KG RAG | Link |
26 | RGR-KBQA: Generating Logical Forms for Question Answering Using Knowledge-Graph-Enhanced Large Language Model | COLING | 2025 | KG RAG | Link |
27 | Knowledge Graph-Guided Retrieval Augmented Generation | arXiv | 2025 | KG RAG | Link |
28 | Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation | ICLR | 2025 | KG RAG | Link |
29 | CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models | arXiv | 2025 | KG RAG | Link |
30 | A Pilot Empirical Study on When and How to Use Knowledge Graphs as Retrieval Augmented Generation | arXiv | 2025 | KG RAG | Link |
31 | A Systematic Exploration of Knowledge Graph Alignment with Large Language Models in Retrieval Augmented Generation | AAAI | 2025 | KG RAG | Link |
32 | Evaluating Knowledge Graph Based Retrieval Augmented Generation Methods under Knowledge Incompleteness | arXiv | 2025 | KG RAG | Link |
33 | RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration | arXiv | 2025 | KG RAG | Link |
34 | Empowering LLMs by hybrid retrieval-augmented generation for domain-centric Q&A in smart manufacturing | Advanced Engineering Informatics | 2025 | Hybrid RAG | Link |
35 | Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Spatial Reasoning Questions | arXiv | 2025 | Spatial RAG | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Subgraph Retrieval Enhanced Model for Multi-hop Knowledge Base Question Answering | ACL | 2022 | Offline KG Guidelines | Link |
2 | keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM | arXiv | 2023 | Offline KG Guidelines | Link |
3 | Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph | arXiv | 2024 | Offline KG Guidelines | Link |
4 | Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models | arXiv | 2024 | Offline KG Guidelines | Link |
5 | Reasoning with Trees: Faithful Question Answering over Knowledge Graph | COLING | 2025 | Offline KG Guidelines | Link |
6 | Empowering Language Models with Knowledge Graph Reasoning for Open-Domain Question Answering | EMNLP | 2022 | Online KG Guidelines | Link |
7 | Knowledge-Enhanced Iterative Instruction Generation and Reasoning for Knowledge Base Question Answering | NLPCC | 2022 | Online KG Guidelines | Link |
8 | Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs | arXiv | 2023 | Online KG Guidelines | Link |
9 | Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph | ICLR | 2024 | Online KG Guidelines | Link |
10 | Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation | ICLR | 2024 | Online KG Guidelines | Link |
11 | KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model's Reasoning Path Aggregation | arXiv | 2024 | Online KG Guidelines | Link |
12 | Retrieval and Reasoning on KGs: Integrate Knowledge Graphs into Large Language Models for Complex Question Answering | EMNLP | 2024 | Online KG Guidelines | Link |
13 | HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models | NeurIPS | 2024 | Online KG Guidelines | Link |
14 | KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph | arXiv | 2024 | Agent-based KG Guidelines | Link |
15 | ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs | ACL Findings | 2024 | Agent-based KG Guidelines | Link |
16 | Plan-on-graph: self-correcting adaptive planning of large language model on knowledge graphs | NeurIPS | 2024 | Agent-based KG Guidelines | Link |
17 | AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning | KDD | 2025 | Agent-based KG Guidelines | Link |
18 | Youtu-GraphRAG: Vertically Unified Agents for Graph Retrieval-Augmented Complex Reasoning | arXiv | 2025 | Agent-based KG Guidelines | Link |
19 | A Collaborative Reasoning Framework Powered by Reinforcement Learning and Large Language Models for Complex Questions Answering over Knowledge Graph | COLING | 2025 | Collaborative Reasoning | Link |
20 | Rule-KBQA: Rule-Guided Reasoning for Complex Knowledge Base Question Answering with Large Language Models | COLING | 2025 | Rule-Guided Reasoning | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Answer Candidate Type Selection: Text-To-Text Language Model for Closed Book Question Answering Meets Knowledge Graphs | KONVENS | 2023 | KG-Driven Filtering and Validation | Link |
2 | KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques | BioNLP Workshop | 2024 | KG-Driven Filtering and Validation | Link |
3 | Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-based Retrofitting | AAAI | 2024 | KG-Driven Filtering and Validation | Link |
4 | Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering | arXiv | 2024 | KG-Augmented Output Refinement | Link |
5 | Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models | ACL | 2024 | KG-Augmented Output Refinement | Link |
6 | Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs | arXiv | 2024 | KG-Augmented Output Refinement | Link |
7 | Optimizing Knowledge Integration in Retrieval-Augmented Generation with Self-Selection | arXiv | 2025 | RAG-based Answers Selection | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering | ACL | 2024 | - | Link |
2 | Explainable Conversational Question Answering over Heterogeneous Sources via Iterative Graph Neural Networks | SIGIR | 2023 | - | Link |
3 | Retrieval In Decoder benefits generative models for explainable complex question answering | Neural Networks | 2025 | - | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Lako: Knowledge-driven visual question answering via late knowledge-to text injection | IJCKG | 2022 | VQA | Link |
2 | Modality-Aware Integration with Large Language Models for Knowledge-Based Visual Question Answering | ACL | 2024 | VQA | Link |
3 | Knowledge-Enhanced Visual Question Answering with Multi-modal Joint Guidance | IJCKG | 2024 | VQA | Link |
4 | ReasVQA: Advancing VideoQA with Imperfect Reasoning Process | arXiv | 2025 | VQA | Link |
5 | Fine-grained knowledge fusion for retrieval-augmented medical visual question answering | Information Fusion | 2025 | VQA | Link |
6 | RAMQA: A Unified Framework for Retrieval-Augmented Multi-Modal Question Answering | arXiv | 2025 | Multi-Modal QA | Link |
7 | MuRAR: A Simple and Effective Multimodal Retrieval and Answer Refinement Framework for Multimodal Question Answering | arXiv | 2024 | Multi-Modal QA | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Knowledge Graph Prompting for Multi-Document Question Answering | AAAI | 2024 | Multi-doc QA | Link |
2 | CuriousLLM: Elevating Multi-Document QA with Reasoning-Infused Knowledge Graph Prompting | arXiv | 2024 | Multi-doc QA | Link |
3 | VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation | arXiv | 2024 | Multi-doc QA | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | GraphLLM: A General Framework for Multi-hop Question Answering over Knowledge Graphs Using Large Language Models | NLPCC | 2024 | Multi-Hop QA | Link |
2 | LLM-KGMQA: Large Language Model-Augmented Multi-Hop Question-Answering System based on Knowledge Graph in Medical Field | KBS | 2024 | Multi-Hop QA | Link |
3 | PokeMQA: Programmable knowledge editing for Multi-hop Question Answering | ACL | 2024 | Multi-Hop QA | Link |
4 | HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs | ACL | 2024 | Multi-Hop QA | Link |
5 | LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments | EMNLP | 2024 | Multi-Hop QA | Link |
6 | SG-RAG: Multi-Hop Question Answering With Large Language Models Through Knowledge Graphs | ICNLSP | 2024 | Multi-Hop QA | Link |
7 | From Superficial to Deep: Integrating External Knowledge for Follow-up Question Generation Using Knowledge Graph and LLM | COLING | 2025 | Multi-Hop QA | Link |
8 | Multi-Hop Question Answering with LLMs & Knowledge Graphs | Blog | 2023 | Multi-Hop QA | Link |
9 | Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering | arXiv | 2025 | Multi-Hop QA | Link |
10 | Knowledge Graph Based Retrieval-Augmented Generation for Multi-Hop Question Answering Enhancement | IEEE IKT | 2024 | Multi-Hop QA | Link |
11 | A Framework of Knowledge Graph-Enhanced Large Language Model Based on Question Decomposition and Atomic Retrieval | EMNLP Findings | 2024 | Multi-Hop QA | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Explainable Conversational Question Answering over Heterogeneous Sources via Iterative Graph Neural Networks | SIGIR | 2023 | Conversational QA | Link |
2 | Conversational Question Answering with Language Models Generated Reformulations over Knowledge Graph | ACL Findings | 2024 | Conversational QA | Link |
3 | LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments | EMNLP | 2024 | Multi-Hop QA | Link |
4 | Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA | EMNLP | 2024 | Conversational QA | Link |
5 | ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA Datasets with Large Language Models | EMNLP | 2024 | Conversational QA | Link |
6 | Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs | SIGMOD | 2025 | Dialogue | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | KG-IRAG: A Knowledge Graph-Based Iterative Retrieval-Augmented Generation Framework for Temporal Reasoning | arXiv | 2025 | Temporal QA | Link |
2 | Two-stage Generative Question Answering on Temporal Knowledge Graph Using Large Language Models | ACL Findings | 2024 | Temporal QA | Link |
3 | TimeR4: Time-aware Retrieval-Augmented Large Language Models for Temporal Knowledge Graph Question Answering | EMNLP | 2024 | Temporal QA | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | MKG-Rank: Enhancing Large Language Models with Knowledge Graph for Multilingual Medical Question Answering | arXiv | 2025 | Multilingual QA | Link |
2 | Language Models as SPARQL Query Filtering for Improving the Quality of Multilingual Question Answering over Knowledge Graphs | ICWE | 2024 | Multilingual QA | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Empowering Large Language Models to Set up a Knowledge Retrieval Indexer via Self-Learning | arXiv | 2023 | Index-based Optimization | Link |
2 | Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs | ICLR | 2024 | Index-based Optimization | Link |
3 | KG-Retriever: Efficient Knowledge Indexing for Retrieval-Augmented Large Language Models | arXiv | 2024 | Index-based Optimization | Link |
4 | Prompting Is Programming: A Query Language for Large Language Models | PLDI | 2023 | Prompting-based Optimization | Link |
5 | LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs | ACL Findings | 2024 | Prompting-based Optimization | Link |
6 | LightRAG: Simple and Fast Retrieval-Augmented Generation | arXiv | 2024 | Graph retrieval-based optimization | Link |
7 | Clue-Guided Path Exploration: Optimizing Knowledge Graph Retrieval with Large Language Models to Address the Information Black Box Challenge | arXiv | 2024 | Graph retrieval-based optimization | Link |
8 | Optimizing open-domain question answering with graph-based retrieval augmented generation | arXiv | 2025 | Graph retrieval-based optimization | Link |
9 | Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation | WWW | 2025 | Graph retrieval-based optimization | Link |
10 | Optimizing Knowledge Integration in Retrieval-Augmented Generation with Self-Selection | arXiv | 2025 | Graph retrieval-based optimization | Link |
11 | Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG | arXiv | 2025 | Graph retrieval-based optimization | Link |
12 | KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques | BioNLP Workshop | 2024 | Ranking-based optimization | Link |
13 | KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering | arXiv | 2024 | Ranking-based optimization | Link |
14 | RAG-based Question Answering over Heterogeneous Data and Text | arXiv | 2024 | Ranking-based optimization | Link |
15 | Cost-efficient Knowledge-based Question Answering with Large Language Models | arXiv | 2024 | Cost-based optimization | Link |
16 | KGLens: Towards Efficient and Effective Knowledge Probing of Large Language Models with Knowledge Graphs | arXiv | 2024 | Cost-based optimization | Link |
17 | Knowledge Graph-Enhanced Large Language Models via Path Selection | ACL Findings | 2024 | Path-based optimization | Link |
18 | LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration | arXiv | 2024 | Path-based optimization | Link |
19 | Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models | arXiv | 2024 | Query-based optimization | Link |
20 | A MapReduce Approach to Effectively Utilize Long Context Information in Retrieval Augmented Language Models | arXiv | 2024 | MapReduce-based optimization | Link |
21 | PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning | arXiv | 2025 | Knowledge conflicts mitigation | Link |
22 | Direct Retrieval-augmented Optimization: Synergizing Knowledge Selection and Language Models | arXiv | 2025 | Knowledge conflicts mitigation | Link |
NO | Title | Venue | Year | Category | Paper Link |
---|---|---|---|---|---|
1 | Triple Augmented Generative Language Models for SPARQL Query Generation from Natural Language Questions | arXiv | 2024 | NL2GQL | Link |
2 | R3-NL2GQL: A Model Coordination and Knowledge Graph Alignment Approach for NL2GQL | ACL Findings | 2024 | NL2GQL | Link |
3 | Aligning Large Language Models to a Domain-specific Graph Database for NL2GQL | CIKM | 2024 | NL2GQL | Link |
4 | UniOQA: A Unified Framework for Knowledge Graph Question Answering with Large Language Models | arXiv | 2024 | NL2GQL | Link |
5 | NAT-NL2GQL: A Novel Multi-Agent Framework for Translating Natural Language to Graph Query Language | arXiv | 2024 | NL2GQL | Link |
6 | CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era | arXiv | 2024 | NL2GQL | Link |
7 | SpCQL: A Semantic Parsing Dataset for Converting Natural Language into Cypher | CIKM | 2022 | NL2GQL | Link |
8 | Robust Text-to-Cypher Using Combination of BERT, GraphSAGE, and Transformer (CoBGT) Model | Applied Sciences | 2024 | NL2GQL | Link |
9 | Real-Time Text-to-Cypher Query Generation with Large Language Models for Graph Databases | Future Internet | 2024 | NL2GQL | Link |
10 | LLM4QA: Leveraging Large Language Model for Efficient Knowledge Graph Reasoning with SPARQL Query | JAIT | 2024 | NL2GQL | Link |
11 | Text to Graph Query Using Filter Condition Attributes | LSGDA@VLDB | 2024 | NL2GQL | Link |
12 | Text-to-CQL Based on Large Language Model and Graph Pattern Enhancement | PRML | 2024 | NL2GQL | Link |
13 | Demystifying Natural Language to Cypher Conversion with OpenAI, Neo4j, LangChain, and LangSmith | Blog | 2024 | NL2GQL | Link |
14 | Text2Cypher, the beginning of the Graph + LLM stack | Blog | 2023 | NL2GQL | Link |
15 | Text2Cypher - Natural Language Queries | Blog | 2023 | NL2GQL | Link |
16 | A Framework for Question Answering on Knowledge Graphs Using Large Language Models | ESWC | 2024 | NL2GQL | Link |
17 | LLaSA: Large Language and Structured Data Assistant | arXiv | 2024 | Structured Data Assistant | Link |
18 | GraphRAG and role of Graph Databases in Advancing AI | IJRCAIT | 2024 | Graph DB | Link |
19 | TigerVector: Supporting Vector Search in Graph Databases for Advanced RAGs | arXiv | 2025 | Graph DB | Link |
20 | Increasing Accuracy of LLM-powered Question Answering on SQL databases: Knowledge Graphs to the Rescue | Data Engineering Bulletin | 2024 | RDB QA | Link |
21 | Symphony: Towards Trustworthy Question Answering and Verification using RAG over Multimodal Data Lakes | Data Engineering Bulletin | 2024 | RDB QA | Link |
22 | Increasing the LLM Accuracy for Question Answering: Ontologies to the Rescue! | arXiv | 2024 | RDB QA | Link |
23 | GTR: Graph-Table-RAG for Cross-Table Question Answering | arXiv | 2025 | RDB QA | Link |
24 | ER-RAG: Enhance RAG with ER-Based Unified Modeling of Heterogeneous Data Sources | arXiv | 2025 | RDB QA | Link |
NO | Title | Venue | Year | Dataset | Category | Paper Link |
---|---|---|---|---|---|---|
1 | The Value of Semantic Parse Labeling for Knowledge Base Question Answering | ACL | 2016 | WebQSP | KBQA and KGQA | Link |
2 | Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs | arXiv | 2024 | CAQA | KBQA and KGQA | Link |
3 | G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering | NeurIPS | 2024 | GraphQA | KBQA and KGQA | Link |
4 | Automatic Question-Answer Generation for Long-Tail Knowledge | KnowledgeNL@KDD | 2023 | Long-tail QA | KBQA and KGQA | Link |
5 | BioASQ-QA: A manually curated corpus for Biomedical Question Answering | Scientific Data | 2023 | BioASQ-QA | KBQA and KGQA | Link |
6 | HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | EMNLP | 2018 | HotpotQA | KBQA and KGQA | Link |
7 | CR-LT-KGQA: A Knowledge Graph Question Answering Dataset Requiring Commonsense Reasoning and Long-Tail Knowledge | arXiv | 2024 | CR-LT-KGQA | KBQA and KGQA | Link |
8 | PAT-Questions: A Self-Updating Benchmark for Present-Anchored Temporal Question-Answering | ACL Findings | 2024 | TemporalQA | KBQA and KGQA | Link |
9 | SituatedQA: Incorporating Extra-Linguistic Contexts into QA | EMNLP | 2021 | SituatedQA | Open-retrieval QA | Link |
10 | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | NAACL | 2019 | CommonsenseQA | Multiple-choice QA | Link |
11 | FanOutQA: A Multi-Hop, Multi-Document Question Answering Benchmark for Large Language Models | ACL | 2024 | FanOutQA | Multi-hop QA | Link |
12 | MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge | arXiv | 2024 | MINTQA | Multi-hop QA | Link |
13 | What Disease Does This Patient Have? A Large-Scale Open Domain Question Answering Dataset from Medical Exams | Applied Sciences | 2021 | MedQA | Multiple-choice QA | Link |
14 | PAT-Questions: A Self-Updating Benchmark for Present-Anchored Temporal Question-Answering | ACL Findings | 2024 | PAQA | Temporal QA | Link |
15 | MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models | ACL Findings | 2023 | MenatQA | Temporal QA | Link |
16 | TempTabQA: Temporal Question Answering for Semi-Structured Tables | EMNLP | 2023 | TempTabQA | Temporal QA | Link |
17 | Complex Temporal Question Answering on Knowledge Graphs | CIKM | 2021 | EXAQT | Temporal QA | Link |
18 | Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA | EMNLP | 2024 | Loong | Multi-doc QA | Link |
19 | MRAMG-Bench: A BeyondText Benchmark for Multimodal Retrieval-Augmented Multimodal Generation | arXiv | 2025 | MRAMG | Multi-modal QA | Link |
20 | OMG-QA: Building Open-Domain Multi-Modal Generative Question Answering Systems | EMNLP | 2024 | OMG-QA | Multi-modal QA | Link |
21 | WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval | arXiv | 2025 | WebFAQ | Multi-domain Multilingual QA | Link |
22 | M2QA: Multi-domain Multilingual Question Answering | EMNLP | 2024 | M2QA | Multi-domain Multilingual QA | Link |
23 | M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models | ACL Findings | 2024 | M3SciQA | Multi-modal QA | Link |
24 | A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases | GRADES-NDA | 2024 | ChatData | LLM and KGs for QA | Link |
25 | XplainLLM: A Knowledge-Augmented Dataset for Reliable Grounded Explanations in LLMs | EMNLP | 2024 | XplainLLM | LLM and KGs for QA | Link |
26 | Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering | SEMANTICS | 2023 | LLM-KG-Bench | LLM and KGs for QA | Link |
27 | Docugami Knowledge Graph Retrieval Augmented Generation (KG-RAG) Datasets | - | 2023 | KG-RAG | LLM and KGs for QA | - |
28 | How Credible Is an Answer From Retrieval-Augmented LLMs? Investigation and Evaluation With Multi-Hop QA | ACL ARR | 2024 | - | LLM and KGs for QA | Link |
29 | Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study over Open-ended Question Answering | arXiv | 2024 | OKGQA | LLM and KGs for QA | Link |
30 | MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation | arXiv | 2025 | LiHua-World | LLM and KGs for QA | Link |
31 | Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | NeurIPS Datasets and Benchmarks Track | 2022 | ScienceQA | LLM and KGs for QA | Link |
32 | STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases | NeurIPS Datasets and Benchmarks Track | 2024 | STaRK | LLM and KGs for QA | Link |
33 | mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs | arXiv | 2025 | mmRAG | LLM and KGs for QA | Link |
34 | LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs -- No Silver Bullet for LC or RAG Routing | arXiv | 2025 | LaRA | LLM and KGs for QA | Link |
35 | KGQAGen: Diagnosing and Addressing Pitfalls in KG-RAG Datasets, toward More Reliable Benchmarking | - | 2025 | KGQAGen | LLM and KGs for QA | - |
36 | AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning | KDD | 2025 | BlendQA | LLM and KGs for QA | - |
NO | Title | Venue | Year | Github | Category | Paper Link |
---|---|---|---|---|---|---|
1 | KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation | arXiv | 2024 | KAG | LLM and KGs for QA | Link |
2 | Fact Finder -- Enhancing Domain Expertise of Large Language Models by Incorporating Knowledge Graphs | arXiv | 2024 | Fact Finder | LLM and KGs for QA | Link |
3 | Leveraging Large Language Models and Knowledge Graphs for Advanced Biomedical Question Answering Systems | CSA 2024 | 2024 | Cypher Translator | LLM and KGs for QA | Link |
4 | A Prompt Engineering Approach and a Knowledge Graph based Framework for Tackling Legal Implications of Large Language Model Answers | arXiv | 2024 | - | LLM and KGs for QA | Link |
5 | Ontology-Aware RAG for Improved Question-Answering in Cybersecurity Education | arXiv | 2024 | - | LLM and KGs for QA | Link |
6 | Knowledge Graphs as a source of trust for LLM-powered enterprise question answering | Journal of Web Semantics | 2025 | - | LLM and KGs for QA | Link |
7 | MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot | WWW | 2025 | MedRAG | LLM and KGs for QA | Link |
8 | EICopilot: Search and Explore Enterprise Information over Large-scale Knowledge Graphs with LLM-driven Agents | arXiv | 2025 | - | LLM and KGs for QA | Link |
9 | Nanjing Yunjin intelligent question-answering system based on knowledge graphs and retrieval augmented generation technology | Heritage Science | 2024 | - | LLM and KGs for QA | Link |
10 | A Joint LLM-KG System for Disease Q&A | IEEE JBHI | 2025 | - | LLM and KGs for QA | Link |
NO | Name | Description | Source | Github |
---|---|---|---|---|
1 | GraphRAG-QA | An industrial demo of GraphRAG that integrates several query engines for augmenting QA: an NLP2Cypher-based KG query engine, a vector RAG query engine, and a graph vector RAG query engine. | NebulaGraph | GraphRAG-QA |
2 | Neo4jRAG-QA | This sample application demonstrates how to implement a Large Language Model (LLM) and Retrieval Augmented Generation (RAG) system with a Neo4j Graph Database. | Neo4j Graph | Neo4j Graph RAG |
3 | BioGraphRAG | A platform that integrates biomedical knowledge graphs stored in NebulaGraph with LLMs via a GraphRAG architecture. | BioGraphRAG | |
4 | kotaemon | An open-source clean & customizable RAG UI for chatting with your documents. Built with both end users and developers in mind. | Cinnamon AI | kotaemon |
5 | PIKE-RAG | sPecIalized KnowledgE and Rationale Augmented Generation, which focuses on extracting, understanding, and applying domain-specific knowledge to gradually guide LLMs toward accurate responses. | Microsoft | PIKE-RAG |
6 | AprèsCoT | AprèsCoT: Explaining LLM Answers with Knowledge Graphs and Chain of Thought. | EDBT25 Demo | AprèsCoT |
NO | Title | Venue | Year | Paper Link |
---|---|---|---|---|
1 | Unifying Large Language Models and Knowledge Graphs: A Roadmap | TKDE | 2024 | Link |
2 | Graph Retrieval-Augmented Generation: A Survey | arXiv | 2024 | Link |
3 | Retrieval-Augmented Generation with Graphs (GraphRAG) | arXiv | 2024 | Link |
4 | Multilingual Question Answering Systems for Knowledge Graphs—A Survey | Semantic Web | 2024 | Link |
5 | Temporal Knowledge Graph Question Answering: A Survey | arXiv | 2024 | Link |
6 | Knowledge Graph and Large Language Model Co-learning via Structure-oriented Retrieval Augmented Generation | Data Engineering Bulletin | 2024 | Link |
7 | Research Trends for the Interplay between Large Language Models and Knowledge Graphs | LLM+KG@VLDB2024 | 2024 | Link |
8 | Neural-Symbolic Reasoning over Knowledge Graphs: A Survey from a Query Perspective | arXiv | 2024 | Link |
9 | Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions | arXiv | 2025 | Link |
10 | Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective | Journal of Web Semantics | 2025 | Link |
11 | A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models | arXiv | 2025 | Link |
12 | Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG | arXiv | 2025 | Link |
13 | A survey on augmenting knowledge graphs (KGs) with large language models (LLMs): models, evaluation metrics, benchmarks, and challenges | Discover Artificial Intelligence | 2024 | Link |
14 | Unifying Large Language Models and Knowledge Graphs for efficient Regulatory Information Retrieval and Answer Generation | RegNLP Workshop | 2025 | Link |
15 | A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task | arXiv | 2025 | Link |
16 | Knowledge Conflicts for LLMs: A Survey | EMNLP | 2024 | Link |
17 | A comprehensive survey on integrating large language models with knowledge-based methods | Knowledge-Based Systems | 2025 | Link |
18 | Synergizing RAG and Reasoning: A Systematic Review | arXiv | 2025 | Link |
19 | A Survey of Multimodal Retrieval-Augmented Generation | arXiv | 2025 | Link |
20 | Retrieval-Augmented Generation with Knowledge Graphs: A Survey | OpenReview | 2025 | Link |
21 | Complex QA and language models hybrid architectures, Survey | arXiv | 2023 | Link |
22 | Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey | arXiv | 2025 | Link |
23 | Graph-Augmented Large Language Model Agents: Current Progress and Future Prospects | arXiv | 2025 | Link |
24 | A Survey on Enhancing Large Language Models with Symbolic Reasoning | OpenReview | 2025 | Link |
25 | Trustworthy Medical Question Answering: An Evaluation-Centric Survey | arXiv | 2025 | Link |