Awesome-LLM4Graph-Papers

[KDD'2024] "LLM4Graph: A Survey of Large Language Models for Graphs"

A collection of papers and resources about Large Language Models (LLMs) for graph learning. The listed work integrates LLMs with graph learning techniques to enhance performance on graph learning tasks, and is categorized into four primary paradigms and nine secondary-level categories. The list is intended to be useful for research or practice on LLMs for graphs.

Graphs are an essential data structure utilized to represent relationships in real-world scenarios. Prior research has established that Graph Neural Networks (GNNs) deliver impressive outcomes in graph-centric tasks, such as link prediction and node classification. Despite these advancements, challenges like data sparsity and limited generalization capabilities continue to persist. Recently, Large Language Models (LLMs) have gained attention in natural language processing. They excel in language comprehension and summarization. Integrating LLMs with graph learning techniques has attracted interest as a way to enhance performance in graph learning tasks.

News

🤗 We're actively working on this project, and your interest is greatly appreciated! To keep up with the latest developments, please consider hitting STAR and WATCH for updates.

🚀 Our LLM4Graph survey has been accepted to KDD 2024, and we will also give a lecture-style tutorial there!
🔥 We gave a tutorial on LLM4Graph at TheWebConf (WWW) 2024!
Our survey paper, "A Survey of Large Language Models for Graphs", is now ready.

Overview

This repository collects recent advances in employing large language models (LLMs) for modeling graph-structured data. We categorize and summarize the approaches based on four primary paradigms and nine secondary-level categories. The four primary categories are: 1) GNNs as Prefix, 2) LLMs as Prefix, 3) LLMs-Graphs Integration, and 4) LLMs-Only.

GNNs as Prefix

LLMs as Prefix

LLMs-Graphs Integration

LLMs-Only

We hope this repository proves valuable to your research or practice in the field of large language models for graphs. If you find it helpful, please consider citing our work:

@inproceedings{ren2024survey,
  title={A survey of large language models for graphs},
  author={Ren, Xubin and Tang, Jiabin and Yin, Dawei and Chawla, Nitesh and Huang, Chao},
  booktitle={Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages={6616--6626},
  year={2024}
}

@inproceedings{huang2024large,
  title={Large Language Models for Graphs: Progresses and Directions},
  author={Huang, Chao and Ren, Xubin and Tang, Jiabin and Yin, Dawei and Chawla, Nitesh},
  booktitle={Companion Proceedings of the ACM on Web Conference 2024},
  pages={1284--1287},
  year={2024}
}

Related Resources

[TKDE'2024] Large language models on graphs: A comprehensive survey [paper]
[IJCAI'2024] A Survey of Graph Meets Large Language Model: Progress and Future Directions [paper]
[NeurIPS'2024] TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs [paper]

🌐 GNNs as Prefix

Node-level Tokenization

(SIGIR'2024) GraphGPT: Graph instruction tuning for large language models [paper]
(arxiv'2024) HiGPT: Heterogeneous Graph Language Model [paper]
(WWW'2024) GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks [paper]
(arxiv'2024) UniGraph: Learning a Cross-Domain Graph Foundation Model From Natural Language [paper]
(NeurIPS'2024) GIMLET: A unified graph-text model for instruction-based molecule zero-shot learning [paper]
(arxiv'2024) XRec: Large Language Models for Explainable Recommendation [paper]

Graph-level Tokenization

(arxiv'2023) GraphLLM: Boosting graph reasoning ability of large language model [paper]
(Computers in Biology and Medicine) GIT-Mol: A multi-modal large language model for molecular science with graph, image, and text [paper]
(EMNLP'2023) MolCA: Molecular graph-language modeling with cross-modal projector and uni-modal adapter [paper]
(arxiv'2023) InstructMol: Multi-modal integration for building a versatile and reliable molecular assistant in drug discovery [paper]
(arxiv'2024) G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [paper]
(AAAI'2024) Graph neural prompting with large language models [paper]

🌐 LLMs as Prefix

Embs. from LLMs for GNNs

(arxiv'2023) Prompt-based node feature extractor for few-shot learning on text-attributed graphs [paper]
(arxiv'2023) SimTeG: A frustratingly simple approach improves textual graph learning [paper]
(KDD'2023) Graph-aware language model pre-training on a large graph corpus can help multiple graph applications [paper]
(ICLR'2024) One for all: Towards training one graph model for all classification tasks [paper]
(ICLR'2024) Harnessing explanations: LLM-to-LM interpreter for enhanced text-attributed graph representation learning [paper]
(WSDM'2024) LLMRec: Large language models with graph augmentation for recommendation [paper]

Labels from LLMs for GNNs

(arxiv'2024) OpenGraph: Towards Open Graph Foundation Models [paper]
(arxiv'2023) Label-free node classification on graphs with large language models (LLMs) [paper]
(arxiv'2024) GraphEdit: Large Language Models for Graph Structure Learning [paper]
(WWW'2024) Representation learning with large language models for recommendation [paper]

🌐 LLMs-Graphs Integration

Alignment between GNNs and LLMs

(arxiv'2022) A molecular multimodal foundation model associating molecule graphs with natural language [paper]
(arxiv'2023) ConGraT: Self-supervised contrastive pretraining for joint graph and text embeddings [paper]
(arxiv'2023) Prompt tuning on graph-augmented low-resource text classification [paper]
(arxiv'2023) GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs [paper]
(Nature Machine Intelligence'2023) Multi-modal molecule structure–text model for text-based retrieval and editing [paper]
(arxiv'2023) Pretraining language models with text-attributed heterogeneous graphs [paper]
(arxiv'2022) Learning on large-scale text-attributed graphs via variational inference [paper]

Fusion Training of GNNs and LLMs

(ICLR'2022) GreaseLM: Graph reasoning enhanced language models for question answering [paper]
(arxiv'2023) Disentangled representation learning with large language models for text-attributed graphs [paper]
(arxiv'2024) Efficient Tuning and Inference for Large Language Models on Textual Graphs [paper]
(WWW'2024) Can GNN be Good Adapter for LLMs? [paper]

LLMs Agent for Graphs

(ACL'2023) Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments [paper]
(arxiv'2022) Graph Agent: Explicit Reasoning Agent for Graphs [paper]
(arxiv'2024) Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments [paper]
(arxiv'2023) Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments [paper]
(ICLR'2024) Reasoning on graphs: Faithful and interpretable large language model reasoning [paper]

🌐 LLMs-Only (* indicates that VLMs are utilized)

Tuning-free

(NeurIPS'2024) Can language models solve graph problems in natural language? [paper]
(arxiv'2023) GPT4Graph: Can large language models understand graph structured data? an empirical evaluation and benchmarking [paper]
(arxiv'2023) Beyond Text: A Deep Dive into Large Language Models’ Ability on Understanding Graph Data [paper]
(KDD'2024) Exploring the potential of large language models (LLMs) in learning on graphs [paper]
(arxiv'2023) Graphtext: Graph reasoning in text space [paper]
(arxiv'2023) Talk like a graph: Encoding graphs for large language models [paper]
(arxiv'2023) LLM4DyG: Can Large Language Models Solve Problems on Dynamic Graphs? [paper]
(arxiv'2023) Which Modality should I use–Text, Motif, or Image?: Understanding Graphs with Large Language Models [paper]
(arxiv'2023) When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning [paper]

Tuning-required

(arxiv'2023) Natural language is all a graph needs [paper]
(NeurIPS'2024) WalkLM: A uniform language model fine-tuning framework for attributed graph embedding [paper]
(NeurIPS'2024) *GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning [paper]
(arxiv'2024) LLaGA: Large Language and Graph Assistant [paper]
(arxiv'2024) InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment [paper]
(arxiv'2024) ZeroG: Investigating Cross-dataset Zero-shot Transferability in Graphs [paper]
(arxiv'2024) GraphWiz: An Instruction-Following Language Model for Graph Problems [paper]
(arxiv'2024) GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability [paper]
(arxiv'2024) MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining [paper]

Contributing

If you have come across relevant resources, feel free to submit a pull request.

- (Journal/Conference'20XX) **paper_name** [[paper](link)]

To add a paper to the survey, please consider providing more detailed information in the PR 😊

GNNs as Prefix
  - (Node-level Tokenization / Graph-level Tokenization)
LLMs as Prefix
  - (Embs. from LLMs for GNNs / Labels from LLMs for GNNs)
LLMs-Graphs Integration
  - (Alignment between GNNs and LLMs / Fusion Training of GNNs and LLMs / LLMs Agent for Graphs)
LLMs-Only
  - (Tuning-free / Tuning-required)
Please also consider providing a brief introduction about the method to help us quickly add the paper to our survey :)
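
As a rough illustration of the entry template and category list above, the following Python sketch checks that a paper is filed under a valid category pair and renders its list line. The names `TAXONOMY` and `format_entry` are hypothetical helpers invented for this example and are not part of the repository; the link in the usage note is a placeholder.

```python
# Hypothetical helper, not part of the repository.
# The taxonomy mirrors the primary / secondary categories listed above.
TAXONOMY = {
    "GNNs as Prefix": ["Node-level Tokenization", "Graph-level Tokenization"],
    "LLMs as Prefix": ["Embs. from LLMs for GNNs", "Labels from LLMs for GNNs"],
    "LLMs-Graphs Integration": [
        "Alignment between GNNs and LLMs",
        "Fusion Training of GNNs and LLMs",
        "LLMs Agent for Graphs",
    ],
    "LLMs-Only": ["Tuning-free", "Tuning-required"],
}


def format_entry(venue, year, title, link, primary, secondary):
    """Return (section path, markdown line) for a new paper entry.

    Raises ValueError if the category pair is not in TAXONOMY.
    """
    if secondary not in TAXONOMY.get(primary, []):
        raise ValueError(f"unknown category: {primary} / {secondary}")
    # Same shape as the template: - (Venue'Year) **paper_name** [[paper](link)]
    line = f"- ({venue}'{year}) **{title}** [[paper]({link})]"
    return f"{primary} / {secondary}", line
```

For example, `format_entry("KDD", 2024, "paper_name", "https://example.org/paper", "LLMs-Only", "Tuning-free")` yields the section path `LLMs-Only / Tuning-free` together with a line ready to paste into that section. Rejecting unknown category pairs early is a design choice of this sketch; it simply keeps a PR from filing a paper under a heading that does not exist.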

Acknowledgements

The design of our README.md is inspired by Awesome-LLM-KG and Awesome-LLMs-in-Graph-tasks. Thanks for their work!
