Awesome-LLM4Graph-Papers
[KDD'2024 Survey+Tutorial] "LLM4Graph: A Survey of Large Language Models for Graphs"
Stars: 84
A collection of papers and resources about Large Language Models (LLMs) for Graph Learning. It covers work that integrates LLMs with graph learning techniques to improve performance on graph learning tasks, and categorizes the approaches into four primary paradigms and nine secondary-level categories. Intended as a resource for research or practice on LLMs for graphs.
README:
A collection of papers and resources about Large Language Models (LLM) for Graph Learning (Graph).
Graphs are an essential data structure utilized to represent relationships in real-world scenarios. Prior research has established that Graph Neural Networks (GNNs) deliver impressive outcomes in graph-centric tasks, such as link prediction and node classification. Despite these advancements, challenges like data sparsity and limited generalization capabilities continue to persist. Recently, Large Language Models (LLMs) have gained attention in natural language processing. They excel in language comprehension and summarization. Integrating LLMs with graph learning techniques has attracted interest as a way to enhance performance in graph learning tasks.
We're actively working on this project, and your interest is greatly appreciated! To keep up with the latest developments, please consider hitting STAR and WATCH for updates.
- Our LLM4Graph survey has been accepted by KDD 2024, and we will also give a lecture-style tutorial there!
- We gave a tutorial on LLM4Graph at TheWebConf (WWW) 2024!
- Our survey paper, A Survey of Large Language Models for Graphs, is now available.
This repository serves as a collection of recent advancements in employing large language models (LLMs) for modeling graph-structured data. We categorize and summarize the approaches based on four primary paradigms and nine secondary-level categories. The four primary categories are:
- GNNs as Prefix
- LLMs as Prefix
- LLMs-Graphs Integration
- LLMs-Only
We hope this repository proves valuable to your research or practice in the field of large language models for graphs. If you find it helpful, please consider citing our work:
@article{ren2024survey,
title={A Survey of Large Language Models for Graphs},
author={Ren, Xubin and Tang, Jiabin and Yin, Dawei and Chawla, Nitesh and Huang, Chao},
journal={arXiv preprint arXiv:2405.08011},
year={2024}
}
@inproceedings{huang2024large,
title={Large Language Models for Graphs: Progresses and Directions},
author={Huang, Chao and Ren, Xubin and Tang, Jiabin and Yin, Dawei and Chawla, Nitesh},
booktitle={Companion Proceedings of the ACM on Web Conference 2024},
pages={1284--1287},
year={2024}
}
- Large language models on graphs: A comprehensive survey [paper]
- A Survey of Graph Meets Large Language Model: Progress and Future Directions [paper]
- (SIGIR'2024) GraphGPT: Graph instruction tuning for large language models [paper]
- (arxiv'2024) HiGPT: Heterogeneous Graph Language Model [paper]
- (WWW'2024) GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks [paper]
- (arxiv'2024) UniGraph: Learning a Cross-Domain Graph Foundation Model From Natural Language [paper]
- (NeurIPS'2024) GIMLET: A unified graph-text model for instruction-based molecule zero-shot learning [paper]
- (arxiv'2024) XRec: Large Language Models for Explainable Recommendation [paper]
- (arxiv'2023) GraphLLM: Boosting graph reasoning ability of large language model [paper]
- (Computers in Biology and Medicine) GIT-Mol: A multi-modal large language model for molecular science with graph, image, and text [paper]
- (EMNLP'2023) MolCA: Molecular graph-language modeling with cross-modal projector and uni-modal adapter [paper]
- (arxiv'2023) InstructMol: Multi-modal integration for building a versatile and reliable molecular assistant in drug discovery [paper]
- (arxiv'2024) G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [paper]
- (AAAI'2024) Graph neural prompting with large language models [paper]
- (arxiv'2023) Prompt-based node feature extractor for few-shot learning on text-attributed graphs [paper]
- (arxiv'2023) SimTeG: A frustratingly simple approach improves textual graph learning [paper]
- (KDD'2023) Graph-aware language model pre-training on a large graph corpus can help multiple graph applications [paper]
- (ICLR'2024) One for all: Towards training one graph model for all classification tasks [paper]
- (ICLR'2024) Harnessing explanations: LLM-to-LM interpreter for enhanced text-attributed graph representation learning [paper]
- (WSDM'2024) LLMRec: Large language models with graph augmentation for recommendation [paper]
- (arxiv'2024) OpenGraph: Towards Open Graph Foundation Models [paper]
- (arxiv'2023) Label-free node classification on graphs with large language models (LLMs) [paper]
- (arxiv'2024) GraphEdit: Large Language Models for Graph Structure Learning [paper]
- (WWW'2024) Representation learning with large language models for recommendation [paper]
- (arxiv'2022) A molecular multimodal foundation model associating molecule graphs with natural language [paper]
- (arxiv'2023) ConGraT: Self-supervised contrastive pretraining for joint graph and text embeddings [paper]
- (arxiv'2023) Prompt tuning on graph-augmented low-resource text classification [paper]
- (arxiv'2023) GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs [paper]
- (Nature Machine Intelligence'2023) Multi-modal molecule structure-text model for text-based retrieval and editing [paper]
- (arxiv'2023) Pretraining language models with text-attributed heterogeneous graphs [paper]
- (arxiv'2022) Learning on large-scale text-attributed graphs via variational inference [paper]
- (ICLR'2022) GreaseLM: Graph reasoning enhanced language models for question answering [paper]
- (arxiv'2023) Disentangled representation learning with large language models for text-attributed graphs [paper]
- (arxiv'2024) Efficient Tuning and Inference for Large Language Models on Textual Graphs [paper]
- (WWW'2024) Can GNN be Good Adapter for LLMs? [paper]
- (ACL'2023) Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments [paper]
- (arxiv'2022) Graph Agent: Explicit Reasoning Agent for Graphs [paper]
- (arxiv'2024) Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments [paper]
- (arxiv'2023) Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments [paper]
- (ICLR'2024) Reasoning on graphs: Faithful and interpretable large language model reasoning [paper]
- (NeurIPS'2024) Can language models solve graph problems in natural language? [paper]
- (arxiv'2023) GPT4Graph: Can large language models understand graph structured data? an empirical evaluation and benchmarking [paper]
- (arxiv'2023) Beyond Text: A Deep Dive into Large Language Models' Ability on Understanding Graph Data [paper]
- (KDD'2024) Exploring the potential of large language models (LLMs) in learning on graphs [paper]
- (arxiv'2023) GraphText: Graph reasoning in text space [paper]
- (arxiv'2023) Talk like a graph: Encoding graphs for large language models [paper]
- (arxiv'2023) LLM4DyG: Can Large Language Models Solve Problems on Dynamic Graphs? [paper]
- (arxiv'2023) Which Modality should I use -- Text, Motif, or Image?: Understanding Graphs with Large Language Models [paper]
- (arxiv'2023) When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning [paper]
- (arxiv'2023) Natural language is all a graph needs [paper]
- (NeurIPS'2024) WalkLM: A uniform language model fine-tuning framework for attributed graph embedding [paper]
- (arxiv'2024) LLaGA: Large Language and Graph Assistant [paper]
- (arxiv'2024) InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment [paper]
- (arxiv'2024) ZeroG: Investigating Cross-dataset Zero-shot Transferability in Graphs [paper]
- (arxiv'2024) GraphWiz: An Instruction-Following Language Model for Graph Problems [paper]
- (arxiv'2024) GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability [paper]
- (arxiv'2024) MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining [paper]
If you have come across relevant resources, feel free to submit a pull request.
- (Journal/Conference'20XX) **paper_name** [[paper](link)]
To add a paper to the survey, please consider providing more detailed information in the PR, in particular which of the following categories it belongs to:
GNNs as Prefix
- (Node-level Tokenization / Graph-level Tokenization)
LLMs as Prefix
- (Embs. from LLMs for GNNs / Labels from LLMs for GNNs)
LLMs-Graphs Integration
- (Alignment between GNNs and LLMs / Fusion Training of GNNs and LLMs / LLMs Agent for Graphs)
LLMs-Only
- (Tuning-free / Tuning-required)
Please also consider providing a brief introduction about the method to help us quickly add the paper to our survey :)
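As a rough illustration of the contribution format above, the following Python sketch encodes the four primary paradigms and nine secondary categories as listed in this README, and formats a list entry in the `(Journal/Conference'20XX) **paper_name** [[paper](link)]` style. The names `TAXONOMY` and `format_entry`, the example title, and the example URL are all hypothetical and are not part of this repository.

```python
# Taxonomy as listed in this README: four primary paradigms,
# nine secondary-level categories. The dict name is hypothetical.
TAXONOMY = {
    "GNNs as Prefix": [
        "Node-level Tokenization",
        "Graph-level Tokenization",
    ],
    "LLMs as Prefix": [
        "Embs. from LLMs for GNNs",
        "Labels from LLMs for GNNs",
    ],
    "LLMs-Graphs Integration": [
        "Alignment between GNNs and LLMs",
        "Fusion Training of GNNs and LLMs",
        "LLMs Agent for Graphs",
    ],
    "LLMs-Only": [
        "Tuning-free",
        "Tuning-required",
    ],
}


def format_entry(venue, year, title, link, primary, secondary):
    """Return a README list entry after checking the category pair.

    Hypothetical helper: raises ValueError if the primary/secondary
    category is not one of those listed in TAXONOMY.
    """
    if primary not in TAXONOMY:
        raise ValueError(f"unknown primary category: {primary!r}")
    if secondary not in TAXONOMY[primary]:
        raise ValueError(f"{secondary!r} is not a category under {primary!r}")
    return f"- ({venue}'{year}) **{title}** [[paper]({link})]"


# Example with placeholder values (not a real paper or link).
entry = format_entry(
    "arxiv", 2024, "Example Paper Title", "https://example.org/paper",
    primary="LLMs-Only", secondary="Tuning-free",
)
print(entry)
```

Checking the category pair before formatting is only a convenience for contributors; the maintainers may organize entries differently.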
The design of our README.md is inspired by Awesome-LLM-KG and Awesome-LLMs-in-Graph-tasks; thanks to their authors for their work!
Alternative AI tools for Awesome-LLM4Graph-Papers
Similar Open Source Tools
Awesome-Text2SQL
Awesome Text2SQL is a curated repository containing tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis, and more. It provides guidelines on converting natural language questions into structured SQL queries, with a focus on NL2SQL. The repository includes information on various models, datasets, evaluation metrics, fine-tuning methods, libraries, and practice projects related to Text2SQL. It serves as a comprehensive resource for individuals interested in working with Text2SQL and related technologies.
Awesome-explainable-AI
This repository contains frontier research on explainable AI (XAI), a hot topic in the field of artificial intelligence. It includes trends, use cases, survey papers, books, open courses, papers, and Python libraries related to XAI. The repository aims to organize and categorize publications on XAI, provide evaluation methods, and list various Python libraries for explainable AI.
VideoTuna
VideoTuna is a codebase for text-to-video applications that integrates multiple AI video generation models for text-to-video, image-to-video, and text-to-image generation. It provides comprehensive pipelines in video generation, including pre-training, continuous training, post-training, and fine-tuning. The models in VideoTuna include U-Net and DiT architectures for visual generation tasks, with upcoming releases of a new 3D video VAE and a controllable facial video generation model.
rllm
rLLM (relationLLM) is a PyTorch library for Relational Table Learning (RTL) with LLMs. It breaks down state-of-the-art GNNs, LLMs, and TNNs into standardized modules and facilitates novel model building in a 'combine, align, and co-train' way using these modules. The library is LLM-friendly, processes various graphs as multiple tables linked by foreign keys, introduces new relational table datasets, and is supported by students and teachers from Shanghai Jiao Tong University and Tsinghua University.
awesome-mcp-servers
A curated list of awesome Model Context Protocol (MCP) servers that enable AI models to securely interact with local and remote resources through standardized server implementations. The list focuses on production-ready and experimental servers extending AI capabilities through file access, database connections, API integrations, and other contextual services.
kan-gpt
The KAN-GPT repository is a PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling. It provides a model for generating text based on prompts, with a focus on improving performance compared to traditional MLP-GPT models. The repository includes scripts for training the model, downloading datasets, and evaluating model performance. Development tasks include integrating with other libraries, testing, and documentation.
sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with LLMs faster and more controllable by co-designing the frontend language and the runtime system. The core features of SGLang include:
- **A Flexible Front-End Language**: This allows for easy programming of LLM applications with multiple chained generation calls, advanced prompting techniques, control flow, multiple modalities, parallelism, and external interaction.
- **A High-Performance Runtime with RadixAttention**: This feature significantly accelerates the execution of complex LLM programs by automatic KV cache reuse across multiple calls. It also supports other common techniques like continuous batching and tensor parallelism.
SummaryYou
Summary You is a tool that utilizes AI to summarize YouTube videos, articles, images, and documents. Users can set the length of the summary and have the option to listen to the summaries. The tool also includes a history section, intelligent paywall detection, OLED-Dark Mode, and a user-friendly Material Design 3 style UI with dynamic color themes. It uses GPT-3.5 OpenAI/Mixtral 8x7B Groq for summarization. The backend is implemented in Python with Chaquopy, and some UI designs and codes are borrowed from Seal Material color utilities.
awesome-production-llm
This repository is a curated list of open-source libraries for production large language models. It includes tools for data preprocessing, training/finetuning, evaluation/benchmarking, serving/inference, application/RAG, testing/monitoring, and guardrails/security. The repository also provides a new category called LLM Cookbook/Examples for showcasing examples and guides on using various LLM APIs.
llm-continual-learning-survey
This repository is an updating survey for Continual Learning of Large Language Models (CL-LLMs), providing a comprehensive overview of various aspects related to the continual learning of large language models. It covers topics such as continual pre-training, domain-adaptive pre-training, continual fine-tuning, model refinement, model alignment, multimodal LLMs, and miscellaneous aspects. The survey includes a collection of relevant papers, each focusing on different areas within the field of continual learning of large language models.
Recommendation-Systems-without-Explicit-ID-Features-A-Literature-Review
This repository is a collection of papers and resources related to recommendation systems, focusing on foundation models, transferable recommender systems, large language models, and multimodal recommender systems. It explores questions such as the necessity of ID embeddings, the shift from matching to generating paradigms, and the future of multimodal recommender systems. The papers cover various aspects of recommendation systems, including pretraining, user representation, dataset benchmarks, and evaluation methods. The repository aims to provide insights and advancements in the field of recommendation systems through literature reviews, surveys, and empirical studies.
For similar tasks
Graph-CoT
This repository contains the source code and datasets for Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs accepted to ACL 2024. It proposes a framework called Graph Chain-of-thought (Graph-CoT) to enable Language Models to traverse graphs step-by-step for reasoning, interaction, and execution. The motivation is to alleviate hallucination issues in Language Models by augmenting them with structured knowledge sources represented as graphs.
Awesome-Graph-LLM
Awesome-Graph-LLM is a curated collection of research papers exploring the intersection of graph-based techniques with Large Language Models (LLMs). The repository aims to bridge the gap between LLMs and graph structures prevalent in real-world applications by providing a comprehensive list of papers covering various aspects of graph reasoning, node classification, graph classification/regression, knowledge graphs, multimodal models, applications, and tools. It serves as a valuable resource for researchers and practitioners interested in leveraging LLMs for graph-related tasks.
Awesome-LLM4RS-Papers
This paper list is about Large Language Model-enhanced Recommender System. It also contains some related works. Keywords: recommendation system, large language models
ai_projects
This repository contains a collection of AI projects covering various areas of machine learning. Each project is accompanied by detailed articles on the associated blog sciblog. Projects range from introductory topics like Convolutional Neural Networks and Transfer Learning to advanced topics like Fraud Detection and Recommendation Systems. The repository also includes tutorials on data generation, distributed training, natural language processing, and time series forecasting. Additionally, it features visualization projects such as football match visualization using Datashader.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
onnxruntime-genai
ONNX Runtime Generative AI is a library that provides the generative AI loop for ONNX models, including inference with ONNX Runtime, logits processing, search and sampling, and KV cache management. Users can call a high level `generate()` method, or run each iteration of the model in a loop. It supports greedy/beam search and TopP, TopK sampling to generate token sequences, has built in logits processing like repetition penalties, and allows for easy custom scoring.
khoj
Khoj is an open-source, personal AI assistant that extends your capabilities by creating always-available AI agents. You can share your notes and documents to extend your digital brain, and your AI agents have access to the internet, allowing you to incorporate real-time information. Khoj is accessible on Desktop, Emacs, Obsidian, Web, and WhatsApp, and you can share PDF, markdown, org-mode, and Notion files, as well as GitHub repositories. You'll get fast, accurate semantic search on top of your docs, and your agents can create deeply personal images and understand your speech. Khoj is self-hostable and always will be.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:
* Self-contained, with no need for a DBMS or cloud service.
* OpenAPI interface, easy to integrate with existing infrastructure (e.g. Cloud IDE).
* Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.