ppt-master
AI-driven SVG presentation generation system that supports PPT, Xiaohongshu, WeChat Moments, and other formats | 15 examples | 229 pages | exports editable PPT files
Stars: 1728
PPT Master is an AI-driven intelligent visual content generation system that converts source documents into high-quality SVG content through multi-role collaboration, supporting various formats such as presentation slides, social media posts, and marketing posters. It provides tools for PDF conversion, SVG post-processing, and PPTX export. Users can interact with AI editors to create content by describing their ideas. The system offers various AI roles for different tasks and provides a comprehensive documentation guide for workflow, design guidelines, canvas formats, image embedding best practices, chart templates, quick references, role definitions, tool usage instructions, example projects, and project workspace structure. Users can contribute to the project by enhancing design templates, chart components, documentation, bug reports, and feature suggestions. The project is open-source under the MIT License.
README:
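An AI-based intelligent visual content generation system that uses multi-role collaboration to turn source documents into high-quality SVG content, supporting presentations, social media posts, marketing posters, and other formats.
🎴 Live examples: GitHub Pages online preview, showing actual generated output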
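This project requires Python 3.8+ to run the PDF conversion, SVG post-processing, and PPTX export tools.
Install Python:
| Platform | Recommended installation |
|---|---|
| macOS | Use Homebrew: `brew install python` |
| Windows | Download the installer from the official Python website |
| Linux | Use the system package manager: `sudo apt install python3 python3-pip` (Ubuntu/Debian) |
💡 Verify the installation: run `python3 --version` and confirm the version is ≥ 3.8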
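The same check can be done from inside Python, for example at the top of a setup script. This is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of this repository.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```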
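To use the `web_to_md.cjs` tool (for converting pages from heavily protected sites such as WeChat Official Accounts), you also need Node.js.
Install Node.js:
| Platform | Recommended installation |
|---|---|
| macOS | Use Homebrew: `brew install node` |
| Windows | Download the LTS installer from the official Node.js website |
| Linux | Use NodeSource: `curl -fsSL https://deb.nodesource.com/setup_lts.x \| sudo -E bash - && sudo apt-get install -y nodejs` |
💡 Verify the installation: run `node --version` and confirm the version is ≥ 18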
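A script can make the same check before invoking the Node-based converter. The sketch below parses the output of `node --version`; the function names are illustrative assumptions, not part of this repository.

```python
import re
import shutil
import subprocess

MIN_NODE_MAJOR = 18  # minimum major version stated in this README


def parse_node_major(version_text):
    """Extract the major version from text such as 'v18.17.0'; None if unparseable."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def installed_node_major():
    """Return the installed Node.js major version, or None if `node` is not on PATH."""
    node = shutil.which("node")
    if node is None:
        return None
    result = subprocess.run(
        [node, "--version"], capture_output=True, text=True, check=False
    )
    return parse_node_major(result.stdout)


if __name__ == "__main__":
    major = installed_node_major()
    if major is None:
        print("Node.js not found")
    elif major < MIN_NODE_MAJOR:
        print(f"Node.js {major} is older than {MIN_NODE_MAJOR}")
    else:
        print(f"Node.js {major} OK")
```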
git clone https://github.com/hugohe3/ppt-master.git
cd ppt-master
pip install -r requirements.txt
If you run into permission problems, use `pip install --user -r requirements.txt` or install inside a virtual environment.
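For the virtual-environment route, one option is Python's standard-library `venv` module. The sketch below assumes an environment directory named `.venv`, which is a common convention rather than something this repository prescribes; the helper names are illustrative.

```python
import sys
import venv
from pathlib import Path


def venv_python(env_dir, windows=sys.platform == "win32"):
    """Path to the interpreter inside an environment created by `venv`."""
    env_dir = Path(env_dir)
    if windows:
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir, requirements="requirements.txt",
                    windows=sys.platform == "win32"):
    """Build the pip command that installs the requirements into that environment."""
    interpreter = str(venv_python(env_dir, windows))
    return [interpreter, "-m", "pip", "install", "-r", requirements]


def create_env(env_dir=".venv"):
    """Create the environment with pip available and return its interpreter path."""
    venv.EnvBuilder(with_pip=True).create(env_dir)
    return venv_python(env_dir)
```

To use it, call `create_env()` from the repository root and then run the list returned by `install_command(".venv")` with `subprocess.run(..., check=True)`.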
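The following AI editors are recommended:
| Tool | Rating | Notes |
|---|---|---|
| Antigravity | ⭐⭐⭐ | Highly recommended. Free access to Opus 4.6 and integrated Banana image generation, so illustrations can be generated directly in the repository |
| Cursor | ⭐⭐ | Mainstream AI editor with support for multiple models |
| VS Code + Copilot | ⭐⭐ | Microsoft's official solution |
| Claude Code | ⭐⭐ | Anthropic's official CLI tool |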
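Open the chat panel in your AI editor and describe what you want to create:
User: I have a report on Q3 results that needs to be turned into a PPT
AI (Strategist role): Sure. Before we start, I need to complete eight confirmations...
1. Canvas format: [Suggested] PPT 16:9
2. Page count: [Suggested] 8-10 pages
...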
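💡 Model recommendation: Opus 4.6 gives the best results and is currently free to use in Antigravity
💡 AI lost context? Prompt it to refer to the `AGENTS.md` file, and it will follow the role definitions in the repository
💡 AI image generation tip: if you need AI-generated illustrations, generate them in Gemini and choose Download full size; the resolution is higher than that of images generated directly in Antigravity. Gemini images carry a star watermark in the lower-right corner, which can be removed with gemini-watermark-remover or this project's `tools/gemini_watermark_remover.py`.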
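| Document | Description |
|---|---|
| 📖 Workflow Tutorial | Detailed workflow and worked case studies |
| 🎨 Design Guide | Color, typography, and layout rules in detail |
| 📐 Canvas Formats | 10+ formats including PPT, Xiaohongshu, and WeChat Moments |
| 🖼️ Image Embedding Guide | Best practices for embedding images in SVG |
| 📊 Chart Template Library | 13 standardized chart templates · online preview |
| ⚡ Quick Reference | Cheat sheet of common commands and parameters |
| 🔧 Role Definitions | Full definitions of the 6 AI roles |
| 🛠️ Tools | Usage instructions for every tool |
| 💼 Examples Index | 15 projects, 229 SVG pages |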
📁 Example library: `examples/` · 15 projects · 229 SVG pages
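| Category | Project | Pages | Highlights |
|---|---|---|---|
| 🏢 Consulting style | Attachment in Psychotherapy | 32 | Top-tier consulting style, largest example |
| | Building Effective AI Agents | 15 | Anthropic engineering blog, AI agent architecture |
| | Chongqing Regional Report | 20 | Regional fiscal analysis, 企业预警通 data 🆕 |
| | Ganzi Prefecture Economic and Fiscal Analysis | 17 | Government fiscal analysis, Tibetan cultural elements |
| 🎨 General and flexible | Six-Step Debugging Method | 10 | Dark tech style |
| | Chongqing University Thesis Format | 11 | Academic formatting guide |
| ✨ Creative style | In-Depth Study of the Qian (Modesty) Hexagram | 20 | I Ching aesthetics, yin-yang line transformation design |
| | Study of the Diamond Sutra, Chapter 1 | 15 | Zen academic style, ink-wash with white space |
| | Git Beginner's Guide | 10 | Pixel retro game style |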
📖 See the full examples documentation
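User input (PDF/URL/Markdown)
↓
[Source conversion] → pdf_to_md.py / web_to_md.py
↓
[Create project] → project_manager.py init <project_name> --format <format>
↓
[Template option] A) Use an existing template B) No template
↓
[Need a new template?] → Create one separately with the /create-template workflow
↓
[Strategist] Eight confirmations and design specification
↓
[Image_Generator] Image generation (only when AI generation is selected)
↓
[Executor] Phased generation
├── Visual construction phase: generate all SVG pages in sequence → svg_output/
└── Logic construction phase: generate the full speaker notes → notes/total.md
↓
[Post-processing] → total_md_split.py (split speaker notes) → finalize_svg.py → svg_to_pptx.py
↓
Output: SVG + PPTX (speaker notes embedded automatically)
↓
[Optimizer_CRAP] Optimizer (optional; use only if the first draft is unsatisfactory)
↓
If optimized: re-run post-processing and export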
💡 PPT editing tip: pages in the exported PPTX are in SVG format. To edit the content, select the page in PowerPoint, right-click, and choose "Convert to Shape". This feature requires Office 2016 or later.
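To confirm that an exported deck really carries SVG pages, you can inspect it as a ZIP archive, since a PPTX file is a ZIP package. The sketch below assumes the SVGs are stored under `ppt/media/`, the conventional location for embedded media; this README does not state where the exporter puts them, and `list_embedded_svgs` is an illustrative name.

```python
import io
import zipfile


def list_embedded_svgs(pptx):
    """Return the names of SVG parts inside a PPTX (path or file-like object)."""
    with zipfile.ZipFile(pptx) as package:
        return sorted(
            name
            for name in package.namelist()
            if name.startswith("ppt/media/") and name.lower().endswith(".svg")
        )


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory so the sketch runs without a real deck.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("ppt/media/image1.svg", "<svg xmlns='http://www.w3.org/2000/svg'/>")
        package.writestr("ppt/media/image1.png", b"")
        package.writestr("ppt/slides/slide1.xml", "<p:sld/>")
    buffer.seek(0)
    print(list_embedded_svgs(buffer))
```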
# Initialize a project
python3 tools/project_manager.py init <project_name> --format ppt169
# Convert a PDF to Markdown
python3 tools/pdf_to_md.py <pdf_file>
# Post-process the SVG files
python3 tools/finalize_svg.py <project_path>
# Export to PPTX
python3 tools/svg_to_pptx.py <project_path> -s final
📖 For complete tool documentation, see the Tools Guide
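Because post-processing and export are re-run after every optimization pass, it can help to wrap the two documented commands in a small script. The sketch below uses only the invocations shown above; the project path `projects/demo` is hypothetical, and `total_md_split.py` is left out because its arguments are not shown in this README.

```python
import shlex
import subprocess
import sys


def post_process_commands(project_path, python=sys.executable):
    """Return the documented post-processing and export commands, in order."""
    return [
        [python, "tools/finalize_svg.py", project_path],
        [python, "tools/svg_to_pptx.py", project_path, "-s", "final"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops the chain at the first failing tool.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical project path; a dry run only prints the commands.
    run_all(post_process_commands("projects/demo"))
```

Run it from the repository root so the relative `tools/` paths resolve, and pass `dry_run=False` once the printed commands look right.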
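ppt-master/
├── roles/ # AI role definitions (6 specialized roles)
├── docs/ # Documentation hub (tutorials, design guide, format specs, etc.)
├── templates/ # Template library (chart templates + 640+ icons)
├── tools/ # Tools (project management, conversion, processing)
├── examples/ # Example projects (15 complete cases)
└── projects/ # User project workspace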
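A quick way to confirm that a clone matches this layout is to check for the six top-level directories. This is a minimal sketch; the directory names come from the tree above, and `missing_dirs` is an illustrative helper.

```python
from pathlib import Path

# Top-level directories listed in the project structure above.
EXPECTED_DIRS = ("roles", "docs", "templates", "tools", "examples", "projects")


def missing_dirs(repo_root="."):
    """Return the expected top-level directories that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    print("layout OK" if not missing else f"missing: {', '.join(missing)}")
```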
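Q: How do I use the generated SVG files?
- Open them directly in a browser
- Export them to PowerPoint with `svg_to_pptx.py` (use "Convert to Shape" in PowerPoint to edit; requires Office 2016+)
- Embed them in an HTML page or edit them in a design tool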
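For the HTML option, a simple gallery page can reference each SVG through an `<img>` tag. The sketch below builds such a page from a list of file names; `build_gallery` is an illustrative helper, and reading from `svg_output/` in the current directory is an assumption based on the workflow's output folder.

```python
from html import escape
from pathlib import Path


def build_gallery(svg_names, title="Slides"):
    """Return an HTML page with one <img> per SVG file name, in the given order."""
    figures = "\n".join(
        f'  <img src="{escape(name, quote=True)}" '
        f'alt="{escape(Path(name).stem, quote=True)}">'
        for name in svg_names
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n{figures}\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Assumed location: the svg_output/ folder of the current project.
    names = sorted(path.name for path in Path("svg_output").glob("*.svg"))
    print(build_gallery(names))
```

The `src` values are bare file names, so save the generated page in the same folder as the SVG files.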
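Q: What is the difference between the three Executors?
- Executor_General: general scenarios, flexible layout
- Executor_Consultant: standard consulting, data visualization
- Executor_Consultant_Top: top-tier consulting (MBB level), 5 core techniques
Q: Is Optimizer_CRAP required?
No. Use it only when you need to improve the visual quality of key pages.
📖 For more questions, see the Workflow Tutorial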
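Contributions are welcome!
- Fork this repository
- Create a branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add AmazingFeature'`)
- Push the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Contribution areas: 🎨 design templates · 📊 chart components · 📝 documentation improvements · 🐛 bug reports · 💡 feature suggestions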
This project is open source under the MIT License.
- SVG Repo - open-source icon library
- Robin Williams - CRAP design principles
- McKinsey, Boston Consulting Group, Bain - design inspiration
- Issue: GitHub Issues
- GitHub: @hugohe3
If this project helps you, please give it a ⭐ Star!
Made with ❤️ by Hugo He