TrainPPTAgent
Template-based PPT generation: produces PPTs that follow a chosen template.
Stars: 81
TrainPPTAgent is an AI-based intelligent presentation generation tool. Users can input a topic and the system will automatically generate a well-structured and content-rich PPT outline and page-by-page content. The project adopts a front-end and back-end separation architecture: the front-end is responsible for interaction, outline editing, and template selection, while the back-end leverages large language models (LLM) and reinforcement learning (GRPO) to complete content generation and optimization, making the generated PPT more tailored to user goals.
README:
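TrainPPTAgent is an AI-powered presentation generator. Given only a topic, it automatically produces a complete, content-rich PPT outline and page-by-page content. The project separates frontend and backend: the frontend handles interaction, outline editing, and template selection, while the backend uses large language models (LLM) and reinforcement learning (GRPO) to generate and optimize content so that the resulting PPT better matches the user's goals.
English: README_EN.md
For the reinforcement learning training code, see the separate project: 👉 PPT model training code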
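- Intelligent outline generation: after a topic is entered, automatically generates a logically clear, well-structured presentation outline.
- Page-by-page content generation: streams PPT content so it is generated and displayed in real time, improving the interactive experience.
- Template support: offers multiple templates to choose from and fills content separately from styling.
- Frontend/backend separation: the frontend uses Vue.js + Vite + TypeScript and the backend is based on Python (Flask/FastAPI), giving a clear and extensible architecture.
- Reinforcement learning driven: applies the GRPO reinforcement learning method to improve the PPT Agent's output so that results better match user needs.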
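As a rough illustration of the streaming feature above, the sketch below shows how a client might reassemble streamed page data. It assumes the stream is newline-delimited JSON, which this README does not confirm; the `page` and `title` fields and the `iter_json_lines` helper are hypothetical names used only for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_json_lines(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield one parsed JSON object per complete line in a chunked byte stream."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may end mid-line, so only consume up to each newline found.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    # Flush a final line that arrived without a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)


# Hypothetical stream: two slide objects split at arbitrary byte boundaries.
fake_stream = [
    b'{"page": 1, "ti',
    b'tle": "Intro"}\n{"page"',
    b': 2, "title": "Agenda"}\n',
]
slides = list(iter_json_lines(fake_stream))
```

Buffering at the byte level means a slide can be rendered as soon as its line is complete, even when the network splits it across chunks.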
- Frontend: Vue.js, Vite, TypeScript
- Backend: Python, Flask/FastAPI, A2A, ADK, MCP search
- AI models: large language models (used for outline and content generation)
TrainPPTAgent/
├── backend/ # Backend code
│ ├── main_api/ # Core API service
│ ├── slide_agent/ # AI Agent logic
│ └── ...
├── frontend/ # Frontend code
│ ├── src/
│ │ ├── views/ # Page components (outline, editor, etc.)
│ │ ├── services/ # API client services
│ │ └── ...
│ └── vite.config.ts # Frontend configuration
└── doc/ # Project documentation
    ├── API_*.md # API reference docs
    └── ...
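Use the provided startup script to launch all backend services with one command:
cd backend
pip install -r requirements.txt
python start_backend.py
Features:
- ✅ Automatically checks the Python version and dependencies
- ✅ Automatically installs required packages
- ✅ Detects and frees occupied ports (requires user confirmation)
- ✅ Automatically sets up environment files
- ✅ Manages and monitors multiple processes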
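The port check mentioned above can be approximated as follows. This is a minimal sketch, not the logic in `start_backend.py`, which is not shown here; `is_port_in_use` and `busy_ports` are hypothetical helpers, and the port numbers are the defaults listed in this README.

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# Default ports listed in this README for the three backend services.
DEFAULT_PORTS = {"main_api": 6800, "simpleOutline": 10001, "slide_agent": 10011}


def busy_ports(ports: dict) -> list:
    """Return the names of services whose port is already taken."""
    return [name for name, port in ports.items() if is_port_in_use(port)]
```

Calling `busy_ports(DEFAULT_PORTS)` before startup would list which services cannot bind their default port.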
- Enter the backend directory:
  cd backend
- Install dependencies:
  pip install -r requirements.txt
- Start the main API service (runs on http://127.0.0.1:6800 by default):
  cd main_api
  cp env_template .env
  python main.py
- Start the outline generation service (runs on http://127.0.0.1:10001 by default):
  cd backend/simpleOutline
  cp env_template .env
  # after copying, edit the .env file
  python main_api.py
- Start the PPT content generation service (runs on http://127.0.0.1:10011 by default):
  cd backend/slide_agent
  cp env_template .env
  # after copying, edit the .env file
  # set the model for each Agent in backend/slide_agent/slide_agent/config.py
  python main_api.py
Details: see backend/启动说明.md
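Since each service expects an edited `.env` file, a quick check that the file was actually filled in can save a failed start. The sketch below assumes simple `KEY=VALUE` lines; the key names in the usage note are placeholders, because the real keys in `env_template` are not listed in this README.

```python
from pathlib import Path


def load_env_file(path: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict, required: list) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]
```

For example, `missing_keys(load_env_file(".env"), ["EXAMPLE_API_KEY"])` returns the placeholder key if it was left blank.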
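- Enter the frontend directory:
  cd frontend
- Install dependencies:
  npm install
- Start the development server (runs on http://127.0.0.1:5173 by default):
  npm run dev
Tip: the frontend talks to the backend API through the Vite proxy; see frontend/vite.config.ts for the configuration.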
docker compose up
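- Enter a topic → the user enters a topic in the frontend
- Generate outline → calls /api/tools/aippt_outline, which produces an outline in Markdown format
- Generate content → calls /api/tools/aippt, which generates content page by page using the template
- Real-time rendering → the frontend renders and displays the complete PPT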
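The two calls in this flow could be prepared from a script roughly as shown below. The endpoint paths come from this README, but routing them through the main API at port 6800 is an assumption, and the request body (the `content` field) is hypothetical; check the API docs under doc/ for the real schema.

```python
import json
import urllib.request

# Assumption: both endpoints are served by the main API at its default address.
BASE_URL = "http://127.0.0.1:6800"


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given API path."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payloads: the field name "content" is illustrative only.
outline_req = build_post("/api/tools/aippt_outline", {"content": "Intro to GRPO"})
content_req = build_post("/api/tools/aippt", {"content": "# Intro to GRPO\n## Background"})

# To actually send a request once the backend is running:
# with urllib.request.urlopen(outline_req) as resp: print(resp.read().decode())
```

Building the request separately from sending it keeps the sketch runnable without a live backend.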
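flowchart TD
U((User)) --> FE[Frontend UI]
FE -->|Enter topic| API[Backend API]
API -->|Call outline service| Outline[Outline service]
Outline -->|Call web search| WebSearch1[Web search]
Outline --> API --> FE
FE -->|Confirm outline| API --> PPTGen[PPT generation service]
PPTGen -->|Call web search| WebSearch2[Web search]
PPTGen -->|Call image search| ImgSearch[Image search]
PPTGen --> API --> FE
FE -->|Render and display PPT| U
- [ ] Table support
- [ ] Support uploading custom PPT templates with automatic annotation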
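- Changelog
- Custom template guide
- Upstream frontend project (this project is copyright-free, but note the licensing of the frontend part): https://github.com/pipipi-pikachu/PPTist
- Template creation
- Configuring different models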
Alternative AI tools for TrainPPTAgent
Similar Open Source Tools
InterPilot
InterPilot is an AI-based assistant tool that captures audio from Windows input/output devices, transcribes it into text, and then calls the Large Language Model (LLM) API to provide answers. The project includes recording, transcription, and AI response modules, aiming to provide support for personal legitimate learning, work, and research. It may assist in scenarios like interviews, meetings, and learning, but it is strictly for learning and communication purposes only. The tool can hide its interface using third-party tools to prevent screen recording or screen sharing, but it does not have this feature built-in. Users bear the risk of using third-party tools independently.
MarkMap-OpenAi-ChatGpt
MarkMap-OpenAi-ChatGpt is a Vue.js-based mind map generation tool that allows users to generate mind maps by entering titles or content. The application integrates the markmap-lib and markmap-view libraries, supports visualizing mind maps, and provides functions for zooming and adapting the map to the screen. Users can also export the generated mind map in PNG, SVG, JPEG, and other formats. This project is suitable for quickly organizing ideas, study notes, project planning, etc. By simply entering content, users can get an intuitive mind map that can be continuously expanded, downloaded, and shared.
aitii-tekisuto
aitii-tekisuto is a unified technical documentation platform for AD domain control and data communication networks, covering architecture design, deployment, security hardening, and daily operation and maintenance. It helps you quickly build and maintain a stable and reliable enterprise network environment. The project uses MkDocs Material to provide a modern documentation site experience, integrating article content encryption functionality and smooth page transition animations for sensitive documents' security protection and optimized user browsing experience.
NovelForge
NovelForge is an AI-assisted writing tool with the potential for creating long-form content of millions of words. It offers a solution that combines world-building, structured content generation, and consistency maintenance. The tool is built around four core concepts: modular 'cards', customizable 'dynamic output models', flexible 'context injection', and consistency assurance through a 'knowledge graph'. It provides a highly structured and configurable writing environment, inspired by the Snowflake Method, allowing users to create and organize their content in a tree-like structure. NovelForge is highly customizable and extensible, allowing users to tailor their writing workflow to their specific needs.
MahoShojo-Generator
MahoShojo-Generator is a web-based AI structured generation tool that allows players to create personalized and evolving magical girls (or quirky characters) and related roles. It offers exciting cyber battles, storytelling activities, and even a ranking feature. The project also includes AI multi-channel polling, user system, public data card sharing, and sensitive word detection. It supports various functionalities such as character generation, arena system, growth and social interaction, cloud and sharing, and other features like scenario generation, tavern ecosystem linkage, and content safety measures.
uDesktopMascot
uDesktopMascot is an open-source project for a desktop mascot application with a theme of 'freedom of creation'. It allows users to load and display VRM or GLB/FBX model files on the desktop, customize GUI colors and background images, and access various features through a menu screen. The application supports Windows 10/11 and macOS platforms.
DocTranslator
DocTranslator is a document translation tool that supports various file formats, compatible with OpenAI format API, and offers batch operations and multi-threading support. Whether for individual users or enterprise teams, DocTranslator helps efficiently complete document translation tasks. It supports formats like txt, markdown, word, csv, excel, pdf (non-scanned), and ppt for AI translation. The tool is deployed using Docker for easy setup and usage.
AutoGLM-GUI
AutoGLM-GUI is an AI-driven Android automation productivity tool that supports scheduled tasks, remote deployment, and 24/7 AI assistance. It features core functionalities such as deploying to servers, scheduling tasks, and creating an AI automation assistant. The tool enhances productivity by automating repetitive tasks, managing multiple devices, and providing a layered agent mode for complex task planning and execution. It also supports real-time screen preview, direct device control, and zero-configuration deployment. Users can easily download the tool for Windows, macOS, and Linux systems, and can also install it via Python package. The tool is suitable for various use cases such as server automation, batch device management, development testing, and personal productivity enhancement.
All-Model-Chat
All Model Chat is a feature-rich, highly customizable web chat application designed specifically for the Google Gemini API family. It integrates dynamic model selection, multimodal file input, streaming responses, comprehensive chat history management, and extensive customization options to provide an unparalleled AI interactive experience.
manga-translator-ui
This repository is a manga image translator tool that allows users to translate text in manga images automatically. It supports various types of manga, including Japanese, Korean, and American, in both black and white and color formats. The tool can detect, translate, and embed text, supporting multiple languages such as Japanese, Chinese, and English. It also includes a visual editor for adjusting text boxes. Users can interact with the tool through a Qt interface or command-line mode for batch processing. The tool offers features like intelligent text detection, multi-language OCR, multiple translation engines, high-quality translation using AI models, automatic term extraction, AI sentence segmentation, intelligent typesetting, PSD export, and batch processing. Additionally, it provides a visual editor for region editing, text editing, mask editing, undo/redo functionality, shortcut key support, and mouse wheel shortcuts.
xiaoyaosearch
XiaoyaoSearch is a cross-platform local desktop application designed for knowledge workers, content creators, and developers. It integrates AI models to support various input methods such as voice, text, and image to intelligently search local files. The application is free for non-commercial use, provides source code and development documentation, and ensures privacy by running locally without uploading data to the cloud. It features modern interface design using Electron, Vue 3, and TypeScript.
N.E.K.O
Project N.E.K.O. is an open-source, community-driven platform aiming to build a digital life form that desires to understand, connect, and grow with us. It is a networked empathetic acknowledging organism, a digital life form that seeks to establish connections and grow together with users. The project's ultimate goal is to create an AI-native metaverse closely connected to the real world, with phases including a creative workshop on Steam, an independent platform with derived games, and the N.E.K.O. Network for autonomous social interactions among AI entities. The core features include open-source core components, sustainable ecosystem, and memory synchronization across different scenarios for a seamless companion experience.
LabelQuick
LabelQuick_V2.0 is a fast image annotation tool designed and developed by the AI Horizon team. This version has been optimized and improved based on the previous version. It provides an intuitive interface and powerful annotation and segmentation functions to efficiently complete dataset annotation work. The tool supports video object tracking annotation, quick annotation by clicking, and various video operations. It introduces the SAM2 model for accurate and efficient object detection in video frames, reducing manual intervention and improving annotation quality. The tool is designed for Windows systems and requires a minimum of 6GB of memory.
nekro-agent
Nekro Agent is an AI chat plugin and proxy execution bot that is highly scalable, offers high freedom, and has minimal deployment requirements. It features context-aware chat for group/private chats, custom character settings, sandboxed execution environment, interactive image resource handling, customizable extension development interface, easy deployment with docker-compose, integration with Stable Diffusion for AI drawing capabilities, support for various file types interaction, hot configuration updates and command control, native multimodal understanding, visual application management control panel, CoT (Chain of Thought) support, self-triggered timers and holiday greetings, event notification understanding, and more. It allows for third-party extensions and AI-generated extensions, and includes features like automatic context trigger based on LLM, and a variety of basic commands for bot administrators.
promptMinder
PromptMinder is a professional prompt word management platform that simplifies and enhances AI prompt word management. It features prompt word version control with support for version tracking and history viewing, diff comparison similar to Git for quick identification of prompt word updates, customizable tagging for quick categorization and retrieval, support for private and public prompt words, integration of AI models for intelligent prompt word generation, team collaboration with team creation, member management, and permission control, community contribution feature with audit and publishing process. The platform also offers a responsive design for mobile devices, internationalization support for Chinese and English languages, modern interface based on Shadcn UI, intelligent search and filtering functionality, and convenient copy and share features. It is built for high performance using Next.js 16 + React 19, with security authentication provided by Clerk, reliable storage using Supabase + PostgreSQL database, and easy deployment supporting Vercel and Zeabur one-click deployment.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.



