awesome-LLM-game-agent-papers
A Survey on Large Language Model-Based Game Agents
Stars: 229
This repository provides a comprehensive survey of research papers on large language model (LLM)-based game agents. LLMs are powerful AI models that can understand and generate human language, and they have shown great promise for developing intelligent game agents. This survey covers a wide range of topics, including adventure games, crafting and exploration games, simulation games, competition games, cooperation games, communication games, and action games. For each topic, the survey provides an overview of the state-of-the-art research, as well as a discussion of the challenges and opportunities for future work.
README:
🔥 Must-read papers for LLM-based game agents.
💫 Continuously updated on a weekly basis. (last update: 2024/09/22)
- [2019/09] Interactive Fiction Games: A Colossal Adventure AAAI 2020 [paper] [code]
- [2020/10] ALFWorld: Aligning Text and Embodied Environments for Interactive Learning ICLR 2021 [paper][code]
- [2022/03] ScienceWorld: Is your Agent Smarter than a 5th Grader? EMNLP 2022 [paper] [code]
- [2022/10] ReAct: Synergizing Reasoning and Acting in Language Models ICLR 2023 [paper] [code]
- [2023/03] Reflexion: Language Agents with Verbal Reinforcement Learning NeurIPS 2023 [paper] [code]
- [2023/04] Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions arXiv [paper]
- [2023/05] SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks NeurIPS 2023 [paper] [code]
- [2023/10] FireAct: Toward Language Agent Fine-tuning arXiv [paper][code]
- [2023/11] ADaPT: As-Needed Decomposition and Planning with Language Models arXiv [paper][code]
- [2024/02] Soft Self-Consistency Improves Language Model Agents arXiv [paper][code]
- [2024/02] Empowering Large Language Model Agents through Action Learning arXiv [paper][code]
- [2024/03] KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents arXiv [paper][code]
- [2024/03] Language Guided Exploration for RL Agents in Text Environments arXiv [paper][code]
- [2024/03] Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents ACL 2024 [paper][code]
- [2024/04] Learning From Failure: Integrating Negative Examples When Fine-tuning Large Language Models as Agent arXiv [paper] [code]
- [2024/04] ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy [paper]
- [2024/05] Agent Planning with World Knowledge Model arXiv [paper][code]
- [2024/05] THREAD: Thinking Deeper with Recursive Spawning arXiv [paper]
- [2024/06] Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement arXiv [paper][code]
- [2023/09] Motif: Intrinsic Motivation from Artificial Intelligence Feedback ICLR 2024 [paper] [code]
- [2024/03] Cradle: Empowering Foundation Agents Towards General Computer Control arXiv [paper][code]
- [2024/03] Playing NetHack with LLMs: Potential & Limitations as Zero-Shot Agents arXiv [paper] [code]
- [2023/02] Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents NeurIPS 2023 [paper][code]
- [2023/03] Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks FMDM@NeurIPS2023 [paper][code]
- [2023/05] Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory arXiv [paper]
- [2023/05] VOYAGER: An Open-Ended Embodied Agent with Large Language Models FMDM@NeurIPS2023 [paper][code]
- [2023/10] LLaMA Rider: Spurring Large Language Models to Explore the Open World arXiv [paper][code]
- [2023/10] Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds ICLR 2024 [paper]
- [2023/11] JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models arXiv [paper][code]
- [2023/11] See and Think: Embodied Agent in Virtual Environment arXiv [paper][code]
- [2023/12] MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception CVPR 2024 [paper][code]
- [2023/12] Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft arXiv [paper]
- [2023/12] Creative Agents: Empowering Agents with Imagination for Creative Tasks arXiv [paper][code]
- [2024/02] RL-GPT: Integrating Reinforcement Learning and Code-as-policy arXiv [paper]
- [2024/03] MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control arXiv [paper][code]
- [2024/07] Odyssey: Empowering Agents with Open-World Skills arXiv [paper] [code]
- [2023/02] Guiding Pretraining in Reinforcement Learning with Large Language Models ICML 2023 [paper]
- [2023/05] SPRING: Studying Papers and Reasoning to play Games NeurIPS 2023 [paper]
- [2023/06] OMNI: Open-endedness via Models of human Notions of Interestingness arXiv [paper][code]
- [2023/09] AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback arXiv [paper]
- [2024/03] EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents arXiv [paper]
- [2024/04] AgentKit: Flow Engineering with Graphs, not Coding arXiv [paper][code]
- [2024/04] World Models with Hints of Large Language Models for Goal Achieving arXiv [paper]
- [2024/07] Enhancing Agent Learning through World Dynamics Modeling arXiv [paper]
- [2023/04] Generative Agents: Interactive Simulacra of Human Behavior UIST 2023 [paper][code]
- [2023/08] AgentSims: An Open-Source Sandbox for Large Language Model Evaluation arXiv [paper]
- [2023/10] Humanoid Agents: Platform for Simulating Human-like Generative Agents arXiv [paper]
- [2023/10] Lyfe Agents: Generative agents for low-cost real-time social interactions arXiv [paper]
- [2023/10] SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents arXiv [paper][code]
- [2024/03] SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents arXiv [paper] [code]
- [2024/09] Altera: Building Digital Humans [website]
- [2022/01] Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents ICML 2022 [paper][code]
- [2022/12] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ICCV 2023 [paper]
- [2023/05] Language Models Meet World Models: Embodied Experiences Enhance Language Models NeurIPS 2023 [paper][code]
- [2023/10] Octopus: Embodied Vision-Language Programmer from Environmental Feedback arXiv [paper] [code]
- [2024/01] True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning arXiv [paper] [code]
- [2024/01] CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents ICLR 2024 [paper][code]
- [2022/10] Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task ICLR 2023 [paper]
- [2023/06] ChessGPT: Bridging Policy Learning and Language Modeling NeurIPS 2023 [paper][code]
- [2023/08] Are ChatGPT and GPT-4 Good Poker Players?--A Pre-Flop Analysis arXiv [paper]
- [2023/09] Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4 arXiv [paper]
- [2023/12] Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach arXiv [paper][code]
- [2024/01] PokerGPT: An End-to-End Lightweight Solver for Multi-Player Texas Hold'em via Large Language Model arXiv [paper]
- [2024/01] SwarmBrain: Embodied agent for real-time strategy game StarCraft II via large language models arXiv [paper]
- [2024/02] PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models arXiv [paper][code]
- [2024/02] Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization arXiv [paper][code]
- [2024/03] Embodied LLM Agents Learn to Cooperate in Organized Teams arXiv [paper]
- [2023/07] Building Cooperative Embodied Agents Modularly with Large Language Models ICLR 2024 [paper][code]
- [2023/09] MindAgent: Emergent Gaming Interaction arXiv [paper]
- [2023/10] Evaluating Multi-agent Coordination Abilities in Large Language Models arXiv [paper]
- [2023/12] LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination arXiv [paper]
- [2024/02] S-Agents: Self-organizing Agents in Open-ended Environments arXiv [paper]
- [2024/03] ProAgent: Building Proactive Cooperative Agents with Large Language Models AAAI 2024 [paper]
- [2024/03] Can LLM-Augmented Autonomous Agents Cooperate?, An Evaluation of Their Cooperative Capabilities through Melting Pot arXiv [paper]
- [2024/03] Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation arXiv [paper]
- [2024/05] Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration arXiv [paper] [code]
- [2022/12] Human-Level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning Science [paper]
- [2023/08] GameEval: Evaluating LLMs on Conversational Games arXiv [paper][code]
- [2023/09] Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf arXiv [paper]
- [2023/10] Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game arXiv [paper]
- [2023/10] Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation arXiv [paper]
- [2023/10] AvalonBench: Evaluating LLMs Playing the Game of Avalon FMDM@NeurIPS2023 [paper][code]
- [2023/10] LLM-Based Agent Society Investigation: Collaboration and Confrontation in Avalon Gameplay arXiv [paper]
- [2023/10] Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models arXiv [paper][code]
- [2023/11] War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars arXiv [paper][code]
- [2023/11] clembench: Systematic Evaluation of Chat-Optimized Language Models as Conversational Agents EMNLP 2023 [paper]
- [2023/12] Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game arXiv [paper]
- [2023/12] Deciphering Digital Detectives: Understanding LLM Behaviors and Capabilities in Multi-Agent Mystery Games [paper]
- [2024/02] Enhance Reasoning for Large Language Models in the Game Werewolf arXiv [paper]
- [2024/02] What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents arXiv [paper]
- [2024/04] Self-playing Adversarial Language Game Enhances LLM Reasoning [paper][code]
- [2024/06] PLAYER: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games arXiv [paper]
- [2023/02] Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning ICML 2023 [paper][code]
- [2024/03] Cradle: Empowering Foundation Agents Towards General Computer Control arXiv [paper][code]
- [2024/03] Will GPT-4 Run DOOM? arXiv [paper][code]
- [2024/03] Evaluate LLMs in Real Time with Street Fighter III GitHub [code]
- [2024/07] Baba Is AI: Break the Rules to Beat the Benchmark ICML 2024 [paper]
- [2024/08] Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games arXiv [paper]
- [2024/09] Can VLMs Play Action Role-Playing Games? Take Black Myth Wukong as a Study Case arXiv [paper] [code]
If you find this repository useful, please cite our paper:
@misc{hu2024survey,
    title={A Survey on Large Language Model-Based Game Agents},
    author={Sihao Hu and Tiansheng Huang and Fatih Ilhan and Selim Tekin and Gaowen Liu and Ramana Kompella and Ling Liu},
    year={2024},
    eprint={2404.02039},
    archivePrefix={arXiv},
    primaryClass={cs.AI}
}
If you discover any papers that are suitable but not included, please contact Sihao Hu ([email protected]).
Similar Open Source Tools
Awesome-Robotics-3D
Awesome-Robotics-3D is a curated list of 3D Vision papers related to Robotics domain, focusing on large models like LLMs/VLMs. It includes papers on Policy Learning, Pretraining, VLM and LLM, Representations, and Simulations, Datasets, and Benchmarks. The repository is maintained by Zubair Irshad and welcomes contributions and suggestions for adding papers. It serves as a valuable resource for researchers and practitioners in the field of Robotics and Computer Vision.
Awesome-LLM-Interpretability
Awesome-LLM-Interpretability is a curated list of materials related to LLM (Large Language Models) interpretability, covering tutorials, code libraries, surveys, videos, papers, and blogs. It includes resources on transformer mechanistic interpretability, visualization, interventions, probing, fine-tuning, feature representation, learning dynamics, knowledge editing, hallucination detection, and redundancy analysis. The repository aims to provide a comprehensive overview of tools, techniques, and methods for understanding and interpreting the inner workings of large language models.
Awesome_papers_on_LLMs_detection
This repository is a curated list of papers focused on the detection of content generated by Large Language Models (LLMs). It includes the latest research papers covering detection methods, datasets, attacks, and more. The repository is regularly updated to include the most recent papers in the field.
LLM-Agents-Papers
A repository that lists papers related to Large Language Model (LLM) based agents. The repository covers various topics including survey, planning, feedback & reflection, memory mechanism, role playing, game playing, tool usage & human-agent interaction, benchmark & evaluation, environment & platform, agent framework, multi-agent system, and agent fine-tuning. It provides a comprehensive collection of research papers on LLM-based agents, exploring different aspects of AI agent architectures and applications.
ABigSurveyOfLLMs
ABigSurveyOfLLMs is a repository that compiles surveys on Large Language Models (LLMs) to provide a comprehensive overview of the field. It includes surveys on various aspects of LLMs such as transformers, alignment, prompt learning, data management, evaluation, societal issues, safety, misinformation, attributes of LLMs, efficient LLMs, learning methods for LLMs, multimodal LLMs, knowledge-based LLMs, extension of LLMs, LLMs applications, and more. The repository aims to help individuals quickly understand the advancements and challenges in the field of LLMs through a collection of recent surveys and research papers.
Awesome-LLM-Robotics
This repository contains a curated list of papers using large language and multi-modal models for robotics and RL, built from the awesome-Implicit-NeRF-Robotics template. Pull requests and emails adding new papers are welcome, and starring, citing, and sharing the list is encouraged. The papers are organized into sections covering surveys, reasoning, planning, manipulation, instructions and navigation, simulation frameworks, and citation information.
Paper-Reading-ConvAI
Paper-Reading-ConvAI is a repository that contains a list of papers, datasets, and resources related to Conversational AI, mainly encompassing dialogue systems and natural language generation. The repository is updated continually.
llm-misinformation-survey
The 'llm-misinformation-survey' repository is dedicated to the survey on combating misinformation in the age of Large Language Models (LLMs). It explores the opportunities and challenges of utilizing LLMs to combat misinformation, providing insights into the history of combating misinformation, current efforts, and future outlook. The repository serves as a resource hub for the initiative 'LLMs Meet Misinformation' and welcomes contributions of relevant research papers and resources. The goal is to facilitate interdisciplinary efforts in combating LLM-generated misinformation and promoting the responsible use of LLMs in fighting misinformation.
do-research-in-AI
This repository is a collection of research lectures and experience sharing posts from frontline researchers in the field of AI. It aims to help individuals upgrade their research skills and knowledge through insightful talks and experiences shared by experts. The content covers various topics such as evaluating research papers, choosing research directions, research methodologies, and tips for writing high-quality scientific papers. The repository also includes discussions on academic career paths, research ethics, and the emotional aspects of research work. Overall, it serves as a valuable resource for individuals interested in advancing their research capabilities in the field of AI.
Call-for-Reviewers
The `Call-for-Reviewers` repository collects the latest 'call for reviewers' links from top CS/ML/AI conferences and journals. It provides an opportunity for individuals in the computer science, machine learning, and artificial intelligence fields to gain review experience for applying for NIW/H1B/EB1 or enhancing their CV. The repository helps users stay updated with the latest research trends and engage with the academic community.
Everything-LLMs-And-Robotics
The Everything-LLMs-And-Robotics repository is the world's largest GitHub repository focusing on the intersection of Large Language Models (LLMs) and Robotics. It provides educational resources, research papers, project demos, and Twitter threads related to LLMs, Robotics, and their combination. The repository covers topics such as reasoning, planning, manipulation, instructions and navigation, simulation frameworks, perception, and more, showcasing the latest advancements in the field.
prompt-in-context-learning
An open-source engineering guide for prompt and in-context learning from EgoAlpha Lab. It is a frequently updated collection of resources covering in-context learning and prompt engineering, including: the latest papers on in-context learning, prompt engineering, agents, and foundation models; a playground of large language models for prompt experimentation; prompt-engineering techniques for leveraging LLMs; ChatGPT prompt examples that can be applied in work and daily life; and an LLMs usage guide for getting started quickly with LangChain.
awesome-llm-attributions
This repository focuses on unraveling the sources that large language models tap into for attribution or citation. It delves into the origins of facts, their utilization by the models, the efficacy of attribution methodologies, and challenges tied to ambiguous knowledge reservoirs, biases, and pitfalls of excessive attribution.
For similar tasks
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
onnxruntime-genai
ONNX Runtime Generative AI is a library that provides the generative AI loop for ONNX models, including inference with ONNX Runtime, logits processing, search and sampling, and KV cache management. Users can call a high-level `generate()` method or run each iteration of the model in a loop. It supports greedy/beam search and top-p/top-k sampling to generate token sequences, has built-in logits processing such as repetition penalties, and allows for easy custom scoring.
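A minimal sketch of the high-level `generate()` flow described above, modeled on the onnxruntime-genai Python examples; the exact class names and search options vary across library versions, so treat this as illustrative rather than authoritative:

```python
# Illustrative only: the API surface below follows the onnxruntime-genai Python
# examples (circa 2024); check the docs of your installed version for changes.
import onnxruntime_genai as og

model = og.Model("path/to/onnx/model/folder")   # folder produced by the model builder
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256, top_k=50, top_p=0.9, temperature=0.7)
params.input_ids = tokenizer.encode("Why do LLM game agents need long-term memory?")

output_tokens = model.generate(params)          # search/sampling and KV cache handled internally
print(tokenizer.decode(output_tokens[0]))
```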
jupyter-ai
Jupyter AI connects generative AI with Jupyter notebooks. It provides a user-friendly and powerful way to explore generative AI models in notebooks and improve your productivity in JupyterLab and the Jupyter Notebook. Specifically, Jupyter AI offers:
- An `%%ai` magic that turns the Jupyter notebook into a reproducible generative AI playground. This works anywhere the IPython kernel runs (JupyterLab, Jupyter Notebook, Google Colab, Kaggle, VSCode, etc.).
- A native chat UI in JupyterLab that enables you to work with generative AI as a conversational assistant.
- Support for a wide range of generative model providers, including AI21, Anthropic, AWS, Cohere, Gemini, Hugging Face, NVIDIA, and OpenAI.
- Local model support through GPT4All, enabling use of generative AI models on consumer-grade machines with ease and privacy.
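As a rough illustration of the `%%ai` magic, the two notebook cells below follow the documented pattern; the model id `openai-chat:gpt-4o` is an assumption, so substitute whichever provider and model you have configured:

```python
# Cell 1: load the magics that ship with Jupyter AI. The model id used in the
# next cell ("openai-chat:gpt-4o") is an assumption; use any configured provider.
%load_ext jupyter_ai_magics
```

```python
%%ai openai-chat:gpt-4o
Summarize in two sentences what a ReAct-style game agent does.
```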
khoj
Khoj is an open-source, personal AI assistant that extends your capabilities by creating always-available AI agents. You can share your notes and documents to extend your digital brain, and your AI agents have access to the internet, allowing you to incorporate real-time information. Khoj is accessible on Desktop, Emacs, Obsidian, Web, and Whatsapp, and you can share PDF, markdown, org-mode, notion files, and GitHub repositories. You'll get fast, accurate semantic search on top of your docs, and your agents can create deeply personal images and understand your speech. Khoj is self-hostable and always will be.
langchain_dart
LangChain.dart is a Dart port of the popular LangChain Python framework created by Harrison Chase. LangChain provides a set of ready-to-use components for working with language models and a standard interface for chaining them together to formulate more advanced use cases (e.g. chatbots, Q&A with RAG, agents, summarization, extraction, etc.). The components can be grouped into a few core modules:
- **Model I/O:** LangChain offers a unified API for interacting with various LLM providers (e.g. OpenAI, Google, Mistral, Ollama, etc.), allowing developers to switch between them with ease. Additionally, it provides tools for managing model inputs (prompt templates and example selectors) and parsing the resulting model outputs (output parsers).
- **Retrieval:** assists in loading user data (via document loaders), transforming it (with text splitters), extracting its meaning (using embedding models), storing (in vector stores) and retrieving it (through retrievers) so that it can be used to ground the model's responses (i.e. Retrieval-Augmented Generation or RAG).
- **Agents:** "bots" that leverage LLMs to make informed decisions about which available tools (such as web search, calculators, database lookup, etc.) to use to accomplish the designated task.
The different components can be composed together using the LangChain Expression Language (LCEL).
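The chaining idea mirrors LCEL in the original Python LangChain; below is a minimal Python sketch of composing a prompt, model, and output parser (it assumes the `langchain-openai` package and an `OPENAI_API_KEY`, and shows the Python API, not LangChain.dart's):

```python
# LCEL-style composition in Python LangChain, shown only to illustrate the
# chaining pattern the Dart port mirrors; requires OPENAI_API_KEY to be set.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Suggest one opening move in {game} and explain why.")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()  # the pipe is LCEL composition

print(chain.invoke({"game": "chess"}))
```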
danswer
Danswer is an open-source Gen-AI Chat and Unified Search tool that connects to your company's docs, apps, and people. It provides a Chat interface and plugs into any LLM of your choice. Danswer can be deployed anywhere and for any scale - on a laptop, on-premise, or to cloud. Since you own the deployment, your user data and chats are fully in your own control. Danswer is MIT licensed and designed to be modular and easily extensible. The system also comes fully ready for production usage with user authentication, role management (admin/basic users), chat persistence, and a UI for configuring Personas (AI Assistants) and their Prompts. Danswer also serves as a Unified Search across all common workplace tools such as Slack, Google Drive, Confluence, etc. By combining LLMs and team specific knowledge, Danswer becomes a subject matter expert for the team. Imagine ChatGPT if it had access to your team's unique knowledge! It enables questions such as "A customer wants feature X, is this already supported?" or "Where's the pull request for feature Y?"
infinity
Infinity is an AI-native database designed for LLM applications, providing incredibly fast full-text and vector search capabilities. It supports a wide range of data types, including vectors, full-text, and structured data, and offers a fused search feature that combines multiple embeddings and full text. Infinity is easy to use, with an intuitive Python API and a single-binary architecture that simplifies deployment. It achieves high performance, with 0.1 milliseconds query latency on million-scale vector datasets and up to 15K QPS.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
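A small hedged sketch of the logging/tracing pattern described above, following the publicly documented `weave.init` / `@weave.op` usage; the project name and the stub function are placeholders:

```python
# Hedged sketch: traces calls to a decorated function into a W&B Weave project.
# Requires the weave package and a logged-in Weights & Biases account.
import weave

weave.init("llm-game-agent-demo")   # placeholder project name

@weave.op()
def choose_action(observation: str) -> str:
    # A real agent would call an LLM here; a stub keeps the example runnable.
    return "go north"

choose_action("You are in a dark room with a door to the north.")
```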
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features:
- Self-contained, with no need for a DBMS or cloud service.
- OpenAPI interface, easy to integrate with existing infrastructure (e.g. Cloud IDE).
- Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
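Since SPEAR exposes an OpenAI Gym-style interface, interaction follows the familiar reset/step loop; the sketch below is a generic illustration of that loop, with the environment object and policy passed in as assumptions rather than SPEAR-specific construction calls:

```python
# Generic Gym-style rollout loop (classic 4-tuple step API). How the SPEAR
# environment is constructed is out of scope here, so `env` and `policy` are
# assumed inputs rather than SPEAR-specific calls.
def run_episode(env, policy, max_steps=500):
    """Roll out one episode and return the accumulated reward."""
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(obs)                        # e.g. a random or learned policy
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```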
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.