LLM-Navigation

大模型学习导航

Stars: 110

Visit

LLM-Navigation is a repository dedicated to documenting learning records related to large models, including basic knowledge, prompt engineering, building effective agents, model expansion capabilities, security measures against prompt injection, and applications in various fields such as AI agent control, browser automation, financial analysis, 3D modeling, and tool navigation using MCP servers. The repository aims to organize and collect information for personal learning and self-improvement through AI exploration.

README:

LLM-Navigation

0. 写在前面

最近学了不少关于大模型的东西，但是随着学得越来越多也越来越杂，有时候想用到之前的一些东西的时候又忘了在哪里，意识到还是需要系统整理与收集一些资料，因此本仓库主要是记录平时的一些学习记录，同时也想通过AI找到另一个自我。

2025年3月16日

碎碎念：框架也是慢慢搭建

1. 基础篇

先补近期学的东西过于基础的后面慢慢补上

大模型基础知识
- 通用类
  - 【2万字】一文搞懂：大模型是怎么被训练出来的？AI大模型落地必读
    - 详细介绍了大语言模型（LLM）的基本原理、训练步骤及关键技术，包括预训练、微调、强化学习（RL）和基于人类反馈的强化学习（RLHF），并探讨了如何通过这些方法提升模型性能和减少幻觉问题。同时也介绍了Deepseek的GRPO算法对RL的优化.
- Prompt
  - Prompt Engineering Guide
    - 通过设计和优化提示词来高效利用大语言模型，可以提升其在问答、数学推理、代码生成等复杂任务中的表现。这个项目中有关于大模型的各类基础概念，新手看可能比较坐牢，可以当作后期的字典查询。
  - System Prompt与User Prompt的使用场景区别
  - 提示词优秀模板
    - 平时看到的各个地方的写的不错的提示词，统一整理到此目录下(佛系更新).
RAG(选看)
- 为什么RAG系统"一看就会，一做就废"？
  - 探讨了检索增强生成（RAG）系统在工程实践中的12个常见问题及其优化策略，涵盖数据清洗、分块处理、嵌入模型选择、元数据使用、多级索引、查询转换、检索参数设置、高级检索策略、重排模型、提示词设计和大语言模型选择等环节。
- 高阶RAG技巧：探索提升RAG系统性能的不同技巧
  - 和上一篇类似，详细介绍了通过索引优化、预检索优化、检索优化和后检索优化等高级技术来提升RAG系统性能的方法，从而提高检索准确性和生成响应的质量。
- 为什么RAG一定需要Rerank？
  - 本文探讨了在检索增强生成（RAG）系统中，当单纯依赖向量搜索和大语言模型（LLM）无法达到理想效果时，如何通过引入重排序（Rerank）技术来提升性能，解决召回率与上下文窗口限制之间的矛盾，并详细解释了Rerank的原理、优势及其在两阶段检索系统中的应用。不过没有实际例子，当科普文学习即可.
- 大模型 RAG 终极指南：信息检索 + 文本向量化 + BGE-M3 实践全解析！
  - 系统梳理了RAG技术中的关键知识点，重点解析了信息检索的三大发展阶段、三种embedding类型的工作原理及对比，以及BGE-M3模型的结构、精调方法和reranker重排序机制，为读者掌握文本向量化和信息检索技术提供了全面的理论与实践指导。
- 实操系列-RAG101
  - RAG101第一课：一个简单的RAG工作流
    - RAG101系列教程的开篇，详细讲解了如何构建一个简单的RAG系统，包括文档加载、文本拆分、嵌入处理、语义检索和检索增强生成等关键步骤，并提供了代码示例和源码链接。
  - RAG101第二课：一个简单的CSV文件RAG工作流
    - 没有基础概念，仅仅是教你如何建立一个针对CSV文件的RAG工作流。
  - RAG101第三课：通过添加验证和优化构建可靠的RAG
    - 本文介绍了 Reliable-RAG 方法，通过文档相关性过滤、幻觉检查和片段高亮等步骤改进传统 RAG 系统，提升生成答案的准确性、可靠性和透明度.
Agent & Agentic & ..
- Agent
  - 一步步教你如何构建一个通用的大模型智能体（LLM Agent）
    - 个人学完感觉非常不错的关于如何设计一个Agent的小白文，文章中详细介绍了构建通用LLM Agent的七个关键步骤：选择合适的LLM、定义控制逻辑与通信结构、制定核心指令、设计并优化工具、制定记忆管理策略、解析Agent输出以及编排下一步操作，同时指出从单Agent原型入手可以为更复杂的多Agent系统奠定基础。
  - Building effective agents
    - 个人看过的关于Agent最好的讲解文章(虽然不是手把手教你如何实现agent，自行百度操作)，本篇中分别介绍了传统工作流式的Agent原理并引入了端到端的Agent理念，深度好文.
  - AI Agent 技术栈全景图
    - 全面解析了AI智能体的技术栈和行业格局，详细阐述了从模型服务、存储、工具调用到框架设计和托管部署的各个环节.
  - 理解这10个核心概念，你就学完了OpenAI最新的开源Agents SDK
    - 这篇文章详细解析了OpenAI推出的全新开源Agents SDK的十大关键概念和特性，包括Models、Agents、Runner、Tools、Context、StructuredOutput、Handoffs、Guardrails、Tracing和Orchestrating，并通过一个贯穿始终的案例帮助我们快速掌握这个强大的Agent开发工具。
  - Why Do Multi-Agent LLM Systems Fail?
    - 通过系统分析多智能体大型语言模型系统的执行轨迹，识别出14种失败模式，并将其归类为三大类，提出了第一个结构化的多智能体系统失败分类法（MASFT），并探讨了通过改进规范、角色和对话管理以及验证策略来缓解这些失败模式的可能性和局限性。偷懒的话也可以看这篇微信公众号总结
- Agentic
  - TODO
- MOE
  - 【科普】大模型中常说的 MoE 是指什么？
    - 介绍了混合专家模型（MoE）是什么，同时分析了其工作原理、优劣势及应用领域。
- 其他
  - 全面对比AI Agent 与 Agentic AI
    - 对Agent与Agentic做了概念层的对比：AI Agent是专注于特定任务的智能体，而Agentic AI则是具备高度自主性和适应性的智能系统.
大模型拓展能力
- MCP
  - MCP官方文档
    - MCP是一个让不同的应用程序和AI模型可以更容易完成行为交互的标准，支持多种传输模式STDIO(本地)和SSE(远程)。(MCP的官方介绍SDK文档，先用起来再具体深入了解他)
  - MCP：昙花一现还是未来标准？
    - 备注：目前来看MCP的核心价值在于允许用户为不可控的AI代理添加自定义工具，无需修改底层代理逻辑，未来期待能成为和Zapier一样实现真正的低代码。在实际生产中，工具需与系统提示词(未来模型能力提升能弥补)、架构高度定制化，MCP的“即插即用”难以实现。
其他
- wx-bot大模型相关每日资料收集
  - 自动爬取的大模型群聊热点讨论内容及链接(非程序文件、仅含分享内容)
- 不再混淆了！一文揭秘MCP Server、Function Call与Agent的核心区别
- Dify 实现DeepResearch工作流拆解并再看升级版Dify能否搭建出Manus？
  - 手把手教学关于如何通过一个工作流实现DeepResearch

x. 安全篇

Prompt对抗与防护
- 攻击篇
  - Prompt
    - Prompt越狱手册
      - 关于一些常见的提示词越狱攻击手法介绍(只需要从锚点定位处开始阅读即可)
    - Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction
      - 一篇关于提示词注入的顶会文章，简单来说就是利用了LLM的建模中，对查询和补全两部分注意力的分配不同，使得当恶意指令出现在模型补全部分时会产生越狱。(好用！！)
  - 模型安全
    - 模型序列化攻击
      - 简单探讨了机器学习中多种主流模型序列化攻击的风险和防御措施
  - 应用
    - Yak支持MCP啦
      - 指挥yak干活不再是梦.
- 防护篇
  - 通过签名解决Prompt注入
    - 评价是简单粗暴高成本
攻防赋能
- OWASP
  - GenAI Red Teaming Guide
    - OWASP出品，概述了GenAI红队的关键组成部分，为网络安全专业人员、AI/ML工程师、红队从业者、风险管理人员、对抗性攻击研究人员、首席信息安全官、架构团队和业务领导者提供了可操作的见解。该指南强调了红队在四个方面的整体方法：模型评估、实施测试、基础设施评估和运行时行为分析。
- 先知安全
  - 2025

X.应用篇

Ps: 无意间看到过的比较有趣的项目都会放进来

Agent
- MobileAgent
  - 阿里巴巴通义实验室出品，简单来讲就是通过指令控制手机完成操作，目前最新版也支持了电脑端操作
- browser-use
  - 让AI控制浏览器，没记错之前manus也是用的这款.同时提供了webui版
- Browserbase
  - 一个用于运行无头浏览器的平台,原生兼容Stagehand、Playwright、Puppeteer、Selenium等框架，并有丰富的SDK方便开发.
金融
- stocks-insights-ai-agent
  - 基于LangChain、LangChain实现的股票表现可视化AI大模型股票分析
建模
- blender-mcp
  - 基于MCP实现的通过大模型3D建模，炫酷！
生活
- 浏览器 AI 阅读助手
  - 一款高度灵活的AI浏览器助手，支持多模型配置、完全自定义提示词、Mermaid图表渲染和对话式总结(因为用户高度自定义，新手还能从中顺手学习提示词的写法)。
- AI 9天完成一本书，客单价1万的全流程分享
  - 这篇文章中我们主要学习的是复杂文本生成任务的AI通用解决思路，可以概括为：明确目标→任务拆解→指令设计→流程化操作→动态调整→结果验证→服务分级

x. 工具/资源篇

Function Calling与MCP
- MCP
  - MCP Server导航站-懒狗必备不想要自己手搓服务可以直接用现成的，但也需要注意安全问题
    - Smithery.ai
      - 华裔青年Henry Mao打造的产品,目前发现交互体验最好的MCP导航网站，每个MCP Server都搭配或生成使用代码.
    - MCP.so
      - 独立开发者idoubi开发的 MCP.so 导航，收录了2k多个Server，数量庞大
    - MCPs.live
      - 没啥好说的，mcp搜索站
    - Composio MCP
      - 每个MCP都可以生成一个SSE URL，开发者技能在自己应用中集成这个MCP的能力，无需从0开发，但可能需要🪜
    - Pulse MCP
      - 已收录了1500+个Server，比较特别的是，网站提供了很多Use Case，让人更直观了解怎么用
Prompt
- PromptUP
  - 一个可以存储并分享Prompt的简单应用.
- prompt-optimizer
  - AI应用能力的关键就是能否写出优秀的提示词，我们应该学习一个优秀合格的提示词应该是什么样的，在这个项目我们重点关注源码中提示词优化的Prompt部分即可(点我直达)
Model
- Awesome-Chinese-LLM
  - 整理开源的中文大语言模型，以规模较小、可私有化部署、训练成本较低的模型为主，包括底座模型，垂直领域微调及应用，数据集与教程等。

NAN

AI防洗稿思路(简单粗暴)
AI工具集
- 该收录了数百个国内外不同类型的AI工具，涵盖写作、绘画、图像、视频、办公、对话、编程、设计、音频、搜索、开发平台、法律助手、内容检测、学习网站等多个领域。

For Tasks:

Click tags to check more tools for each tasks

optimize prompts control mobile devices automate browser tasks analyze stock performance navigate mcp servers

For Jobs:

ai researcher data scientist machine learning engineer software developer security analyst

Alternative AI tools for LLM-Navigation

Similar Open Source Tools

LLM-Navigation

github

: 110

llm_interview_note

This repository provides a comprehensive overview of large language models (LLMs), covering various aspects such as their history, types, underlying architecture, training techniques, and applications. It includes detailed explanations of key concepts like Transformer models, distributed training, fine-tuning, and reinforcement learning. The repository also discusses the evaluation and limitations of LLMs, including the phenomenon of hallucinations. Additionally, it provides a list of related courses and references for further exploration.

github

: 2.1k

hongbomiao.com

hongbomiao.com is a personal research and development (R&D) lab that facilitates the sharing of knowledge. The repository covers a wide range of topics including web development, mobile development, desktop applications, API servers, cloud native technologies, data processing, machine learning, computer vision, embedded systems, simulation, database management, data cleaning, data orchestration, testing, ops, authentication, authorization, security, system tools, reverse engineering, Ethereum, hardware, network, guidelines, design, bots, and more. It provides detailed information on various tools, frameworks, libraries, and platforms used in these domains.

github

: 233

chatwiki

ChatWiki is an open-source knowledge base AI question-answering system. It is built on large language models (LLM) and retrieval-augmented generation (RAG) technologies, providing out-of-the-box data processing, model invocation capabilities, and helping enterprises quickly build their own knowledge base AI question-answering systems. It offers exclusive AI question-answering system, easy integration of models, data preprocessing, simple user interface design, and adaptability to different business scenarios.

github

: 415

AI-Catalog

AI-Catalog is a curated list of AI tools, platforms, and resources across various domains. It serves as a comprehensive repository for users to discover and explore a wide range of AI applications. The catalog includes tools for tasks such as text-to-image generation, summarization, prompt generation, writing assistance, code assistance, developer tools, low code/no code tools, audio editing, video generation, 3D modeling, search engines, chatbots, email assistants, fun tools, gaming, music generation, presentation tools, website builders, education assistants, autonomous AI agents, photo editing, AI extensions, deep face/deep fake detection, text-to-speech, startup tools, SQL-related AI tools, education tools, and text-to-video conversion.

github

: 361

Daily-DeepLearning

Daily-DeepLearning is a repository that covers various computer science topics such as data structures, operating systems, computer networks, Python programming, data science packages like numpy, pandas, matplotlib, machine learning theories, deep learning theories, NLP concepts, machine learning practical applications, deep learning practical applications, and big data technologies like Hadoop and Hive. It also includes coding exercises related to '剑指offer'. The repository provides detailed explanations and examples for each topic, making it a comprehensive resource for learning and practicing different aspects of computer science and data-related fields.

github

: 666

system-prompts-and-models-of-ai-tools

This repository contains a significant portion of the FULL official v0, Manus, and Cursor system prompts and AI models. It includes over 5,000+ lines of insights into their structure and functionality. The available files include FULL v0, v0 model.txt, v0 tools.txt, Cursor (with cursor agent.txt, cursor ask.txt, cursor edit.txt), and Manus Folder with multiple files inside.

github

: 6.5k

llm-apps-java-spring-ai

The 'LLM Applications with Java and Spring AI' repository provides samples demonstrating how to build Java applications powered by Generative AI and Large Language Models (LLMs) using Spring AI. It includes projects for question answering, chat completion models, prompts, templates, multimodality, output converters, embedding models, document ETL pipeline, function calling, image models, and audio models. The repository also lists prerequisites such as Java 21, Docker/Podman, Mistral AI API Key, OpenAI API Key, and Ollama. Users can explore various use cases and projects to leverage LLMs for text generation, vector transformation, document processing, and more.

github

: 484

ChatGPT-airport-tizi-fanqiang

This repository provides a curated list of recommended airport proxies for accessing ChatGPT and other AI tools while bypassing internet restrictions. The proxies are tested and verified to ensure reliability and stability. The readme includes detailed instructions on how to set up and use the proxies with various devices and platforms. Additionally, the repository offers advanced tutorials on upgrading to GPT-4/Plus, deploying a 24/7 ChatGPT微信机器人 server, and using Claude-3 securely and for free.

github

: 239

ai_wiki

This repository provides a comprehensive collection of resources, open-source tools, and knowledge related to quantitative analysis. It serves as a valuable knowledge base and navigation guide for individuals interested in various aspects of quantitative investing, including platforms, programming languages, mathematical foundations, machine learning, deep learning, and practical applications. The repository is well-structured and organized, with clear sections covering different topics. It includes resources on system platforms, programming codes, mathematical foundations, algorithm principles, machine learning, deep learning, reinforcement learning, graph networks, model deployment, and practical applications. Additionally, there are dedicated sections on quantitative trading and investment, as well as large models. The repository is actively maintained and updated, ensuring that users have access to the latest information and resources.

github

: 346

ComfyUI_Yvann-Nodes

ComfyUI_Yvann-Nodes is a pack of custom nodes that enable audio reactivity within ComfyUI, allowing users to create AI-driven animations that sync with music. Users can generate audio reactive AI videos, control AI generation styles, content, and composition with any audio input. The tool is simple to use by dropping workflows in ComfyUI and specifying audio and visual inputs. It is flexible and works with existing ComfyUI AI tech and nodes like IPAdapter, AnimateDiff, and ControlNet. Users can pick workflows for Images → Video or Video → Video, download the corresponding .json file, drop it into ComfyUI, install missing custom nodes, set inputs, and generate audio-reactive animations.

github

: 340

AiLearning-Theory-Applying

This repository provides a comprehensive guide to understanding and applying artificial intelligence (AI) theory, including basic knowledge, machine learning, deep learning, and natural language processing (BERT). It features detailed explanations, annotated code, and datasets to help users grasp the concepts and implement them in practice. The repository is continuously updated to ensure the latest information and best practices are covered.

github

: 2.9k

awesome-llm-plaza

Awesome LLM plaza is a curated list of awesome LLM papers, projects, and resources. It is updated daily and includes resources from a variety of sources, including huggingface daily papers, twitter, github trending, paper with code, weixin, etc.

github

: 191

DeepBattler

DeepBattler is a tool designed for Hearthstone Battlegrounds players, providing real-time strategic advice and insights to improve gameplay experience. It integrates with the Hearthstone Deck Tracker plugin and offers voice-assisted guidance. The tool is powered by a large language model (LLM) and can match the strength of top players on EU servers. Users can set up the tool by adding dependencies, configuring the plugin path, and launching the LLM agent. DeepBattler is licensed for personal, educational, and non-commercial use, with guidelines on non-commercial distribution and acknowledgment of external contributions.

github

: 88

J.A.R.V.I.S.2.0

github

: 123

Thinking_in_Java_MindMapping

Thinking_in_Java_MindMapping is a repository that started as a project to create mind maps based on the book 'Java Programming Ideas'. Over time, it evolved into a collection of programming notes, blog posts, book summaries, personal reflections, and even gaming content. The repository covers a wide range of topics, allowing the author to freely express thoughts and ideas. The content is diverse and reflects the author's dedication to consistency and creativity.

github

: 1.6k

For similar tasks

FinAnGPT-Pro

FinAnGPT-Pro is a financial data downloader and AI query system that downloads quarterly and annual financial data for stocks from EOD Historical Data, storing it in MongoDB and Google BigQuery. It includes an AI-powered natural language interface for querying financial data. Users can set up the tool by following the prerequisites and setup instructions provided in the README. The tool allows users to download financial data for all stocks in a watchlist or for a single stock, query financial data using a natural language interface, and receive responses in a structured format. Important considerations include error handling, rate limiting, data validation, BigQuery costs, MongoDB connection, and security measures for API keys and credentials.

github

: 188

LLM-Navigation

github

: 110

Botright

Botright is a tool designed for browser automation that focuses on stealth and captcha solving. It uses a real Chromium-based browser for enhanced stealth and offers features like browser fingerprinting and AI-powered captcha solving. The tool is suitable for developers looking to automate browser tasks while maintaining anonymity and bypassing captchas. Botright is available in async mode and can be easily integrated with existing Playwright code. It provides solutions for various captchas such as hCaptcha, reCaptcha, and GeeTest, with high success rates. Additionally, Botright offers browser stealth techniques and supports different browser functionalities for seamless automation.

github

: 396

CoolCline

CoolCline is a proactive programming assistant that combines the best features of Cline, Roo Code, and Bao Cline. It seamlessly collaborates with your command line interface and editor, providing the most powerful AI development experience. It optimizes queries, allows quick switching of LLM Providers, and offers auto-approve options for actions. Users can configure LLM Providers, select different chat modes, perform file and editor operations, integrate with the command line, automate browser tasks, and extend capabilities through the Model Context Protocol (MCP). Context mentions help provide explicit context, and installation is easy through the editor's extension panel or by dragging and dropping the `.vsix` file. Local setup and development instructions are available for contributors.

github

: 132

cursor-tools

cursor-tools is a CLI tool designed to enhance AI agents with advanced skills, such as web search, repository context, documentation generation, GitHub integration, Xcode tools, and browser automation. It provides features like Perplexity for web search, Gemini 2.0 for codebase context, and Stagehand for browser operations. The tool requires API keys for Perplexity AI and Google Gemini, and supports global installation for system-wide access. It offers various commands for different tasks and integrates with Cursor Composer for AI agent usage.

github

: 3.5k

log10

Log10 is a one-line Python integration to manage your LLM data. It helps you log both closed and open-source LLM calls, compare and identify the best models and prompts, store feedback for fine-tuning, collect performance metrics such as latency and usage, and perform analytics and monitor compliance for LLM powered applications. Log10 offers various integration methods, including a python LLM library wrapper, the Log10 LLM abstraction, and callbacks, to facilitate its use in both existing production environments and new projects. Pick the one that works best for you. Log10 also provides a copilot that can help you with suggestions on how to optimize your prompt, and a feedback feature that allows you to add feedback to your completions. Additionally, Log10 provides prompt provenance, session tracking and call stack functionality to help debug prompt chains. With Log10, you can use your data and feedback from users to fine-tune custom models with RLHF, and build and deploy more reliable, accurate and efficient self-hosted models. Log10 also supports collaboration, allowing you to create flexible groups to share and collaborate over all of the above features.

github

: 96

LMOps

LMOps is a research initiative focusing on fundamental research and technology for building AI products with foundation models, particularly enabling AI capabilities with Large Language Models (LLMs) and Generative AI models. The project explores various aspects such as prompt optimization, longer context handling, LLM alignment, acceleration of LLMs, LLM customization, and understanding in-context learning. It also includes tools like Promptist for automatic prompt optimization, Structured Prompting for efficient long-sequence prompts consumption, and X-Prompt for extensible prompts beyond natural language. Additionally, LLMA accelerators are developed to speed up LLM inference by referencing and copying text spans from documents. The project aims to advance technologies that facilitate prompting language models and enhance the performance of LLMs in various scenarios.

github

: 3.6k

awesome-llm-json

This repository is an awesome list dedicated to resources for using Large Language Models (LLMs) to generate JSON or other structured outputs. It includes terminology explanations, hosted and local models, Python libraries, blog articles, videos, Jupyter notebooks, and leaderboards related to LLMs and JSON generation. The repository covers various aspects such as function calling, JSON mode, guided generation, and tool usage with different providers and models.

github

: 1.9k

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 855

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.3k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 30.6k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675