DeepAudit

DeepAudit：人人拥有的 AI 黑客战队，让漏洞挖掘触手可及。国内首个开源的代码漏洞挖掘多智能体系统。小白一键部署运行，自主协作审计 + 自动化沙箱 PoC 验证。支持 Ollama 私有部署，一键生成报告。支持中转站。让安全不再昂贵，让审计不再复杂。

Stars: 4589

Visit

DeepAudit is an AI audit team accessible to everyone, making vulnerability discovery within reach. It is a next-generation code security audit platform based on Multi-Agent collaborative architecture. It simulates the thinking mode of security experts, achieving deep code understanding, vulnerability discovery, and automated sandbox PoC verification through multiple intelligent agents (Orchestrator, Recon, Analysis, Verification). DeepAudit aims to address the three major pain points of traditional SAST tools: high false positive rate, blind spots in business logic, and lack of verification means. Users only need to import the project, and DeepAudit automatically starts working: identifying the technology stack, analyzing potential risks, generating scripts, sandbox verification, and generating reports, ultimately outputting a professional audit report. The core concept is to let AI attack like a hacker and defend like an expert.

README:

DeepAudit - 人人拥有的 AI 审计战队，让漏洞挖掘触手可及 🦸‍♂️

简体中文 | English

📸 界面预览

🤖 Agent 审计入口

首页快速进入 Multi-Agent 深度审计

📋 审计流日志实时查看 Agent 思考与执行过程	🎛️ 智能仪表盘一眼掌握项目安全态势
⚡ 即时分析粘贴代码 / 上传文件，秒出结果	🗂️ 项目管理 GitHub/GitLab/Gitea 导入，多项目协同管理

📊 专业报告

一键导出 PDF / Markdown / JSON（图中为快速模式，非Agent模式报告）

👉 查看Agent审计完整报告示例

🏆 CVE 漏洞发现

DeepAudit 已成功发现并获得 49 个 CVE 编号，涉及 16 个知名开源项目

CVE 编号	项目	漏洞类型	CVSS
CVE-2026-1884	Zentao PMS	SSRF	5.1
CVE-2025-13789	Zentao PMS	SSRF	5.3
CVE-2025-13787	Zentao PMS	Privilege Escalation	9.1
CVE-2025-64428	Dataease	JNDI Injection	9.8
CVE-2025-13246	Modulithshop	SQL Injection	6.3
CVE-2025-64163	Dataease	SSRF	9.8
CVE-2025-64164	Dataease	JNDI Injection	9.8
CVE-2025-11581	PowerJob	Privilege Escalation	7.5
CVE-2025-11580	PowerJob	Privilege Escalation	5.3
CVE-2025-10771	Jimureport	Deserialization	9.8
CVE-2025-10770	Jimureport	Deserialization	6.5
CVE-2025-10769	H2o-3	Deserialization	9.8
CVE-2025-10768	H2o-3	Deserialization	9.8
CVE-2025-58045	Dataease	JNDI Injection	9.8
CVE-2025-10423	Newbee-mall	Guessable Captcha	3.7
CVE-2025-10422	Newbee-mall	Privilege Escalation	4.3
CVE-2025-9835	Mall	Privilege Escalation	4.3
CVE-2025-9737	O2oa	XSS	5.4
CVE-2025-9736	O2oa	XSS	5.4
CVE-2025-9735	O2oa	XSS	5.4
CVE-2025-9734	O2oa	XSS	5.4
CVE-2025-9719	O2oa	XSS	5.4
CVE-2025-9718	O2oa	XSS	5.4
CVE-2025-9717	O2oa	XSS	5.4
CVE-2025-9716	O2oa	XSS	5.4
CVE-2025-9715	O2oa	XSS	5.4
CVE-2025-9683	O2oa	XSS	5.4
CVE-2025-9682	O2oa	XSS	5.4
CVE-2025-9681	O2oa	XSS	5.4
CVE-2025-9680	O2oa	XSS	5.4
CVE-2025-9659	O2oa	XSS	5.4
CVE-2025-9658	O2oa	XSS	5.4
CVE-2025-9657	O2oa	XSS	5.4
CVE-2025-9655	O2oa	XSS	5.4
CVE-2025-9646	O2oa	XSS	5.4
CVE-2025-9602	RockOA	Database Backdoor	6.5
CVE-2025-9514	Mall	Privilege Escalation	3.7
CVE-2025-9264	Xxl-job	Privilege Escalation	5.4
CVE-2025-9263	Xxl-job	Privilege Escalation	4.3
CVE-2025-9241	Eladmin	CSV/XLSX Injection	7.5
CVE-2025-9240	Eladmin	Sensitive Information Disclosure	4.3
CVE-2025-9239	Eladmin	Hardcoded Credentials	3.7
CVE-2025-8974	Litemall	Hardcoded Credentials	9.8
CVE-2025-8852	Wukong CRM	Sensitive Information Disclosure	4.3
CVE-2025-8840	Jsherp	Privilege Escalation	5.4
CVE-2025-8839	Jsherp	Privilege Escalation	8.8
CVE-2025-8764	Litemall	XSS	5.4
CVE-2025-8753	Litemall	Arbitrary File Deletion	5.4
CVE-2025-8708	White-Jotter	Deserialization	7.5

👉 查看完整 CVE 列表详情

以上漏洞由 DeepAudit 团队成员 @ez-lbz 使用 DeepAudit 挖掘发现

如果您使用 DeepAudit 发现了漏洞，欢迎在 Issues 中留言反馈。您的贡献将极大地丰富这份漏洞列表，非常感谢！

⚡ 项目概述

DeepAudit 是一个基于 Multi-Agent 协作架构的下一代代码安全审计平台。它不仅仅是一个静态扫描工具，而是模拟安全专家的思维模式，通过多个智能体（Orchestrator, Recon, Analysis, Verification）的自主协作，实现对代码的深度理解、漏洞挖掘和 自动化沙箱 PoC 验证。

我们致力于解决传统 SAST 工具的三大痛点：

误报率高 — 缺乏语义理解，大量误报消耗人力
业务逻辑盲点 — 无法理解跨文件调用和复杂逻辑
缺乏验证手段 — 不知道漏洞是否真实可利用

用户只需导入项目，DeepAudit 便全自动开始工作：识别技术栈 → 分析潜在风险 → 生成脚本 → 沙箱验证 → 生成报告，最终输出一份专业审计报告。

核心理念: 让 AI 像黑客一样攻击，像专家一样防御。

💡 为什么选择 DeepAudit？

😫 传统审计的痛点	💡 DeepAudit 解决方案
人工审计效率低跨不上 CI/CD 代码迭代速度，拖慢发布流程	🤖 Multi-Agent 自主审计 AI 自动编排审计策略，全天候自动化执行
传统工具误报多缺乏语义理解，每天花费大量时间清洗噪音	🧠 RAG 知识库增强结合代码语义与上下文，大幅降低误报率
数据隐私担忧担心核心源码泄露给云端 AI，无法满足合规要求	🔒 支持 Ollama 本地部署数据不出内网，支持 Llama3/DeepSeek 等本地模型
无法确认真实性外包项目漏洞多，不知道哪些漏洞真实可被利用	💥 沙箱 PoC 验证自动生成并执行攻击脚本，确认漏洞真实危害

🏗️ 系统架构

整体架构图

DeepAudit 采用微服务架构，核心由 Multi-Agent 引擎驱动。

🔄 审计工作流

步骤	阶段	负责 Agent	主要动作
1	策略规划	Orchestrator	接收审计任务，分析项目类型，制定审计计划，下发任务给子 Agent
2	信息收集	Recon Agent	扫描项目结构，识别框架/库/API，提取攻击面（Entry Points）
3	漏洞挖掘	Analysis Agent	结合 RAG 知识库与 AST 分析，深度审查代码，发现潜在漏洞
4	PoC 验证	Verification Agent	(关键) 编写 PoC 脚本，在 Docker 沙箱中执行。如失败则自我修正重试
5	报告生成	Orchestrator	汇总所有发现，剔除被验证为误报的漏洞，生成最终报告

📂 项目代码结构

DeepAudit/
├── backend/                        # Python FastAPI 后端
│   ├── app/
│   │   ├── agents/                 # Multi-Agent 核心逻辑
│   │   │   ├── orchestrator.py     # 总指挥：任务编排
│   │   │   ├── recon.py            # 侦察兵：资产识别
│   │   │   ├── analysis.py         # 分析师：漏洞挖掘
│   │   │   └── verification.py     # 验证者：沙箱 PoC
│   │   ├── core/                   # 核心配置与沙箱接口
│   │   ├── models/                 # 数据库模型
│   │   └── services/               # RAG, LLM 服务封装
│   └── tests/                      # 单元测试
├── frontend/                       # React + TypeScript 前端
│   ├── src/
│   │   ├── components/             # UI 组件库
│   │   ├── pages/                  # 页面路由
│   │   └── stores/                 # Zustand 状态管理
├── docker/                         # Docker 部署配置
│   ├── sandbox/                    # 安全沙箱镜像构建
│   └── postgres/                   # 数据库初始化
└── docs/                           # 详细文档

🚀 快速开始

方式一：一行命令部署（推荐）

使用预构建的 Docker 镜像，无需克隆代码，一行命令即可启动：

curl -fsSL https://raw.githubusercontent.com/lintsinghua/DeepAudit/v3.0.0/docker-compose.prod.yml | docker compose -f - up -d

🇨🇳 国内加速部署（作者亲测非常无敌之快）

使用南京大学镜像站加速拉取 Docker 镜像（将 ghcr.io 替换为 ghcr.nju.edu.cn）：

# 国内加速版 - 使用南京大学 GHCR 镜像站
curl -fsSL https://raw.githubusercontent.com/lintsinghua/DeepAudit/v3.0.0/docker-compose.prod.cn.yml | docker compose -f - up -d

手动拉取镜像（如需单独拉取）（点击展开）

# 前端镜像
docker pull ghcr.nju.edu.cn/lintsinghua/deepaudit-frontend:latest

# 后端镜像
docker pull ghcr.nju.edu.cn/lintsinghua/deepaudit-backend:latest

# 沙箱镜像
docker pull ghcr.nju.edu.cn/lintsinghua/deepaudit-sandbox:latest

💡 镜像源由南京大学开源镜像站提供支持

💡 配置 Docker 镜像加速（可选，进一步提升拉取速度）（点击展开）

如果拉取镜像仍然较慢，可以配置 Docker 镜像加速器。编辑 Docker 配置文件并添加以下镜像源：

Linux / macOS：编辑 /etc/docker/daemon.json

Windows：右键 Docker Desktop 图标 → Settings → Docker Engine

{
  "registry-mirrors": [
    "https://docker.1ms.run",
    "https://dockerproxy.com",
    "https://hub.rat.dev"
  ]
}

保存后重启 Docker 服务：

# Linux
sudo systemctl restart docker

# macOS / Windows
# 重启 Docker Desktop 应用

🎉 启动成功！ 访问 http://localhost:3000 开始体验。

方式二：克隆代码部署

适合需要自定义配置或二次开发的用户：

# 1. 克隆项目
git clone https://github.com/lintsinghua/DeepAudit.git && cd DeepAudit

# 2. 配置环境变量
cp backend/env.example backend/.env
# 编辑 backend/.env 填入你的 LLM API Key

# 3. 一键启动
docker compose up -d

首次启动会自动构建沙箱镜像，可能需要几分钟。

🔧 源码开发指南

适合开发者进行二次开发调试。

环境要求

Python 3.11+
Node.js 20+
PostgreSQL 15+
Docker (用于沙箱)

1. 手动启动数据库

docker compose up -d redis db adminer

2. 后端启动

cd backend
# 配置环境
cp env.example .env

# 使用 uv 管理环境（推荐）
uv sync
source .venv/bin/activate

# 启动 API 服务
uvicorn app.main:app --reload

3. 前端启动

cd frontend
# 配置环境
cp .env.example .env

pnpm install
pnpm dev

3. 沙箱环境

开发模式下需要本地 Docker 拉取沙箱镜像：

# 标准拉取
docker pull ghcr.io/lintsinghua/deepaudit-sandbox:latest

# 国内加速（南京大学镜像站）
docker pull ghcr.nju.edu.cn/lintsinghua/deepaudit-sandbox:latest

🤖 Multi-Agent 智能审计

支持的漏洞类型

漏洞类型	描述
`sql_injection`	SQL 注入
`xss`	跨站脚本攻击
`command_injection`	命令注入
`path_traversal`	路径遍历
`ssrf`	服务端请求伪造
`xxe`	XML 外部实体注入

漏洞类型	描述
`insecure_deserialization`	不安全反序列化
`hardcoded_secret`	硬编码密钥
`weak_crypto`	弱加密算法
`authentication_bypass`	认证绕过
`authorization_bypass`	授权绕过
`idor`	不安全直接对象引用

📖 详细文档请查看 Agent 审计指南

🔌 支持的 LLM 平台

🌍 国际平台

OpenAI GPT-4o / GPT-4
Claude 3.5 Sonnet / Opus
Google Gemini Pro
DeepSeek V3

🇨🇳 国内平台

通义千问 Qwen
智谱 GLM-4
Moonshot Kimi
文心一言 · MiniMax · 豆包

🏠 本地部署

Ollama
Llama3 · Qwen2.5 · CodeLlama
DeepSeek-Coder · Codestral
代码不出内网

💡 支持 API 中转站，解决网络访问问题 | 详细配置 → LLM 平台支持

🎯 功能矩阵

功能	说明	模式
🤖 Agent 深度审计	Multi-Agent 协作，自主编排审计策略	Agent
🧠 RAG 知识增强	代码语义理解，CWE/CVE 知识库检索	Agent
🔒 沙箱 PoC 验证	Docker 隔离执行，验证漏洞有效性	Agent
🗂️ 项目管理	GitHub/GitLab/Gitea 导入，ZIP 上传，10+ 语言支持	通用
⚡ 即时分析	代码片段秒级分析，粘贴即用	通用
🔍 五维检测	Bug · 安全 · 性能 · 风格 · 可维护性	通用
💡 What-Why-How	精准定位 + 原因解释 + 修复建议	通用
📋 审计规则	内置 OWASP Top 10，支持自定义规则集	通用
📝 提示词模板	可视化管理，支持中英文双语	通用
📊 报告导出	PDF / Markdown / JSON 一键导出	通用
⚙️ 运行时配置	浏览器配置 LLM，无需重启服务	通用

🦖 发展路线图

我们正在持续演进，未来将支持更多语言和更强大的 Agent 能力。

[x] 基础静态分析，集成 Semgrep
[x] 引入 RAG 知识库，支持 Docker 安全沙箱
[x] Multi-Agent 协作架构 (Current)
[ ] 支持更真实的模拟服务环境，进行更真实漏洞验证流程
[ ] 沙箱从function_call优化集成为稳定MCP服务
[ ] 自动修复 (Auto-Fix): Agent 直接提交 PR 修复漏洞
[ ] 增量PR审计: 持续跟踪 PR 变更，智能分析漏洞，并集成CI/CD流程
[ ] 优化RAG: 支持自定义知识库

🤝 贡献与社区

贡献指南

我们非常欢迎您的贡献！无论是提交 Issue、PR 还是完善文档。请查看 CONTRIBUTING.md 了解详情。

📬 联系作者

欢迎大家来和我交流探讨！无论是技术问题、功能建议还是合作意向，都期待与你沟通~ （平台定制、代码审计服务、技术咨询、合作洽谈等请通过邮箱联系）

联系方式
📧 邮箱	[email protected]
🐙 GitHub	@lintsinghua

📄 许可证

本项目采用 AGPL-3.0 License 开源。

📈 项目热度

Made with ❤️ by lintsinghua

致谢

感谢以下开源项目的支持：

FastAPI · LangChain · LangGraph · ChromaDB · LiteLLM · Tree-sitter · Kunlun-M · Strix · React · Vite · Radix UI · TailwindCSS · shadcn/ui

⚠️ 重要安全声明

法律合规声明

禁止任何未经授权的漏洞测试、渗透测试或安全评估
本项目仅供网络空间安全学术研究、教学和学习使用
严禁将本项目用于任何非法目的或未经授权的安全测试

漏洞上报责任

发现任何安全漏洞时，请及时通过合法渠道上报
严禁利用发现的漏洞进行非法活动
遵守国家网络安全法律法规，维护网络空间安全

使用限制

仅限在授权环境下用于教育和研究目的
禁止用于对未授权系统进行安全测试
使用者需对自身行为承担全部法律责任

免责声明

作者不对任何因使用本项目而导致的直接或间接损失负责，使用者需对自身行为承担全部法律责任。

📖 详细安全政策

有关安装政策、免责声明、代码隐私、API使用安全和漏洞报告的详细信息，请参阅 DISCLAIMER.md 和 SECURITY.md 文件。

快速参考

代码隐私警告: 您的代码将被发送到所选择的LLM服务商服务器
敏感代码处理: 使用本地模型处理敏感代码
合规要求: 遵守数据保护和隐私法律法规
漏洞报告: 发现安全问题请通过合法渠道上报

For Tasks:

Click tags to check more tools for each tasks

automate code security audits discover vulnerabilities generate audit reports simulate security expert thinking verify vulnerabilities with sandbox poc

For Jobs:

security analyst software engineer penetration tester security consultant ai auditor

Alternative AI tools for DeepAudit

Similar Open Source Tools

DeepAudit

github

: 4.6k

MedicalGPT

MedicalGPT is a training medical GPT model with ChatGPT training pipeline, implement of Pretraining, Supervised Finetuning, RLHF(Reward Modeling and Reinforcement Learning) and DPO(Direct Preference Optimization).

github

: 3.6k

jimeng-free-api-all

Jimeng AI Free API is a reverse-engineered API server that encapsulates Jimeng AI's image and video generation capabilities into OpenAI-compatible API interfaces. It supports the latest jimeng-5.0-preview, jimeng-4.6 text-to-image models, Seedance 2.0 multi-image intelligent video generation, zero-configuration deployment, and multi-token support. The API is fully compatible with OpenAI API format, seamlessly integrating with existing clients and supporting multiple session IDs for polling usage.

github

: 263

AstrBot

AstrBot is a powerful and versatile tool that leverages the capabilities of large language models (LLMs) like GPT-3, GPT-3.5, and GPT-4 to enhance communication and automate tasks. It seamlessly integrates with popular messaging platforms such as QQ, QQ Channel, and Telegram, enabling users to harness the power of AI within their daily conversations and workflows.

github

: 6.6k

gpt_server

The GPT Server project leverages the basic capabilities of FastChat to provide the capabilities of an openai server. It perfectly adapts more models, optimizes models with poor compatibility in FastChat, and supports loading vllm, LMDeploy, and hf in various ways. It also supports all sentence_transformers compatible semantic vector models, including Chat templates with function roles, Function Calling (Tools) capability, and multi-modal large models. The project aims to reduce the difficulty of model adaptation and project usage, making it easier to deploy the latest models with minimal code changes.

github

: 211

Qbot

Qbot is an AI-oriented automated quantitative investment platform that supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and reinforcement learning. It provides a full closed-loop process from data acquisition, strategy development, backtesting, simulation trading to live trading. The platform emphasizes AI strategies such as machine learning, reinforcement learning, and deep learning, combined with multi-factor models to enhance returns. Users with some Python knowledge and trading experience can easily utilize the platform to address trading pain points and gaps in the market.

github

: 7.0k

HivisionIDPhotos

HivisionIDPhoto is a practical algorithm for intelligent ID photo creation. It utilizes a comprehensive model workflow to recognize, cut out, and generate ID photos for various user photo scenarios. The tool offers lightweight cutting, standard ID photo generation based on different size specifications, six-inch layout photo generation, beauty enhancement (waiting), and intelligent outfit swapping (waiting). It aims to solve emergency ID photo creation issues.

github

: 10.3k

MindChat

MindChat is a psychological large language model designed to help individuals relieve psychological stress and solve mental confusion, ultimately improving mental health. It aims to provide a relaxed and open conversation environment for users to build trust and understanding. MindChat offers privacy, warmth, safety, timely, and convenient conversation settings to help users overcome difficulties and challenges, achieve self-growth, and development. The tool is suitable for both work and personal life scenarios, providing comprehensive psychological support and therapeutic assistance to users while strictly protecting user privacy. It combines psychological knowledge with artificial intelligence technology to contribute to a healthier, more inclusive, and equal society.

github

: 436

hello-agents

Hello-Agents is a comprehensive tutorial on building intelligent agent systems, covering both theoretical foundations and practical applications. The tutorial aims to guide users in understanding and building AI-native agents, diving deep into core principles, architectures, and paradigms of intelligent agents. Users will learn to develop their own multi-agent applications from scratch, gaining hands-on experience with popular low-code platforms and agent frameworks. The tutorial also covers advanced topics such as memory systems, context engineering, communication protocols, and model training. By the end of the tutorial, users will have the skills to develop real-world projects like intelligent travel assistants and cyber towns.

github

: 19.9k

Firefly

Firefly is an open-source large model training project that supports pre-training, fine-tuning, and DPO of mainstream large models. It includes models like Llama3, Gemma, Qwen1.5, MiniCPM, Llama, InternLM, Baichuan, ChatGLM, Yi, Deepseek, Qwen, Orion, Ziya, Xverse, Mistral, Mixtral-8x7B, Zephyr, Vicuna, Bloom, etc. The project supports full-parameter training, LoRA, QLoRA efficient training, and various tasks such as pre-training, SFT, and DPO. Suitable for users with limited training resources, QLoRA is recommended for fine-tuning instructions. The project has achieved good results on the Open LLM Leaderboard with QLoRA training process validation. The latest version has significant updates and adaptations for different chat model templates.

github

: 4.8k

GodHook

GodHook is an Xposed module that integrates various fun features, including automatic replies with support for multiple AI language models, subscription functionality for daily news, inspirational quotes, and weather updates, as well as interface functions to execute host app message functions for operations alerts and data push scenarios. It also offers various other features waiting to be explored. The module is designed for learning and communication purposes only and should not be used for malicious purposes. It requires technical knowledge to configure API model information and aims to lower the technical barrier for wider usage in the future.

github

: 110

JiwuChat

JiwuChat is a lightweight multi-platform chat application built on Tauri2 and Nuxt3, with various real-time messaging features, AI group chat bots (such as 'iFlytek Spark', 'KimiAI' etc.), WebRTC audio-video calling, screen sharing, and AI shopping functions. It supports seamless cross-device communication, covering text, images, files, and voice messages, also supporting group chats and customizable settings. It provides light/dark mode for efficient social networking.

github

: 627

petercat

Peter Cat is an intelligent Q&A chatbot solution designed for community maintainers and developers. It provides a conversational Q&A agent configuration system, self-hosting deployment solutions, and a convenient integrated application SDK. Users can easily create intelligent Q&A chatbots for their GitHub repositories and quickly integrate them into various official websites or projects to provide more efficient technical support for the community.

github

: 1.3k

md

The WeChat Markdown editor automatically renders Markdown documents as WeChat articles, eliminating the need to worry about WeChat content layout! As long as you know basic Markdown syntax (now with AI, you don't even need to know Markdown), you can create a simple and elegant WeChat article. The editor supports all basic Markdown syntax, mathematical formulas, rendering of Mermaid charts, GFM warning blocks, PlantUML rendering support, ruby annotation extension support, rich code block highlighting themes, custom theme colors and CSS styles, multiple image upload functionality with customizable configuration of image hosting services, convenient file import/export functionality, built-in local content management with automatic draft saving, integration of mainstream AI models (such as DeepSeek, OpenAI, Tongyi Qianwen, Tencent Hanyuan, Volcano Ark, etc.) to assist content creation.

github

: 10.7k

lingti-bot

lingti-bot is an AI Bot platform that integrates MCP Server, multi-platform message gateway, rich toolset, intelligent conversation, and voice interaction. It offers core advantages like zero-dependency deployment with a single 30MB binary file, cloud relay support for quick integration with enterprise WeChat/WeChat Official Account, built-in browser automation with CDP protocol control, 75+ MCP tools covering various scenarios, native support for Chinese platforms like DingTalk, Feishu, enterprise WeChat, WeChat Official Account, and more. It is embeddable, supports multiple AI backends like Claude, DeepSeek, Kimi, MiniMax, and Gemini, and allows access from platforms like DingTalk, Feishu, enterprise WeChat, WeChat Official Account, Slack, Telegram, and Discord. The bot is designed with simplicity as the highest design principle, focusing on zero-dependency deployment, embeddability, plain text output, code restraint, and cloud relay support.

github

: 67

MiniCPM

MiniCPM is a series of open-source large models on the client side jointly developed by Face Intelligence and Tsinghua University Natural Language Processing Laboratory. The main language model MiniCPM-2B has only 2.4 billion (2.4B) non-word embedding parameters, with a total of 2.7B parameters. - After SFT, MiniCPM-2B performs similarly to Mistral-7B on public comprehensive evaluation sets (better in Chinese, mathematics, and code capabilities), and outperforms models such as Llama2-13B, MPT-30B, and Falcon-40B overall. - After DPO, MiniCPM-2B also surpasses many representative open-source large models such as Llama2-70B-Chat, Vicuna-33B, Mistral-7B-Instruct-v0.1, and Zephyr-7B-alpha on the current evaluation set MTBench, which is closest to the user experience. - Based on MiniCPM-2B, a multi-modal large model MiniCPM-V 2.0 on the client side is constructed, which achieves the best performance of models below 7B in multiple test benchmarks, and surpasses larger parameter scale models such as Qwen-VL-Chat 9.6B, CogVLM-Chat 17.4B, and Yi-VL 34B on the OpenCompass leaderboard. MiniCPM-V 2.0 also demonstrates leading OCR capabilities, approaching Gemini Pro in scene text recognition capabilities. - After Int4 quantization, MiniCPM can be deployed and inferred on mobile phones, with a streaming output speed slightly higher than human speech speed. MiniCPM-V also directly runs through the deployment of multi-modal large models on mobile phones. - A single 1080/2080 can efficiently fine-tune parameters, and a single 3090/4090 can fully fine-tune parameters. A single machine can continuously train MiniCPM, and the secondary development cost is relatively low.

github

: 8.3k

For similar tasks

DeepAudit

github

: 4.6k

LLM-FuzzX

LLM-FuzzX is an open-source user-friendly fuzz testing tool for large language models (e.g., GPT, Claude, LLaMA), equipped with advanced task-aware mutation strategies, fine-grained evaluation, and jailbreak detection capabilities. It helps researchers and developers quickly discover potential security vulnerabilities and enhance model robustness. The tool features a user-friendly web interface for visual configuration and real-time monitoring, supports various advanced mutation methods, integrates RoBERTa model for real-time jailbreak detection and evaluation, supports multiple language models like GPT, Claude, LLaMA, provides visualization analysis with seed flowcharts and experiment data statistics, and offers detailed logging support for main, mutation, and jailbreak logs.

github

: 108

hexstrike-ai

HexStrike AI is an advanced AI-powered penetration testing MCP framework with 150+ security tools and 12+ autonomous AI agents. It features a multi-agent architecture with intelligent decision-making, vulnerability intelligence, and modern visual engine. The platform allows for AI agent connection, intelligent analysis, autonomous execution, real-time adaptation, and advanced reporting. HexStrike AI offers a streamlined installation process, Docker container support, 250+ specialized AI agents/tools, native desktop client, advanced web automation, memory optimization, enhanced error handling, and bypassing limitations.

github

: 3.2k

For similar jobs

holisticai

Holistic AI is an open-source library dedicated to assessing and improving the trustworthiness of AI systems. It focuses on measuring and mitigating bias, explainability, robustness, security, and efficacy in AI models. The tool provides comprehensive metrics, mitigation techniques, a user-friendly interface, and visualization tools to enhance AI system trustworthiness. It offers documentation, tutorials, and detailed installation instructions for easy integration into existing workflows.

github

: 69

DeepAudit

github

: 4.6k

hackingBuddyGPT

hackingBuddyGPT is a framework for testing LLM-based agents for security testing. It aims to create common ground truth by creating common security testbeds and benchmarks, evaluating multiple LLMs and techniques against those, and publishing prototypes and findings as open-source/open-access reports. The initial focus is on evaluating the efficiency of LLMs for Linux privilege escalation attacks, but the framework is being expanded to evaluate the use of LLMs for web penetration-testing and web API testing. hackingBuddyGPT is released as open-source to level the playing field for blue teams against APTs that have access to more sophisticated resources.

github

: 374

aio-proxy

This script automates setting up TUIC, hysteria and other proxy-related tools in Linux. It features setting domains, getting SSL certification, setting up a simple web page, SmartSNI by Bepass, Chisel Tunnel, Hysteria V2, Tuic, Hiddify Reality Scanner, SSH, Telegram Proxy, Reverse TLS Tunnel, different panels, installing, disabling, and enabling Warp, Sing Box 4-in-1 script, showing ports in use and their corresponding processes, and an Android script to use Chisel tunnel.

github

: 274

aircrackauto

AirCrackAuto is a tool that automates the aircrack-ng process for Wi-Fi hacking. It is designed to make it easier for users to crack Wi-Fi passwords by automating the process of capturing packets, generating wordlists, and launching attacks. AirCrackAuto is a powerful tool that can be used to crack Wi-Fi passwords in a matter of minutes.

github

: 79

awesome-gpt-security

Awesome GPT + Security is a curated list of awesome security tools, experimental case or other interesting things with LLM or GPT. It includes tools for integrated security, auditing, reconnaissance, offensive security, detecting security issues, preventing security breaches, social engineering, reverse engineering, investigating security incidents, fixing security vulnerabilities, assessing security posture, and more. The list also includes experimental cases, academic research, blogs, and fun projects related to GPT security. Additionally, it provides resources on GPT security standards, bypassing security policies, bug bounty programs, cracking GPT APIs, and plugin security.

github

: 459

h4cker

This repository is a comprehensive collection of cybersecurity-related references, scripts, tools, code, and other resources. It is carefully curated and maintained by Omar Santos. The repository serves as a supplemental material provider to several books, video courses, and live training created by Omar Santos. It encompasses over 10,000 references that are instrumental for both offensive and defensive security professionals in honing their skills.

github

: 20.4k

aircrack-ng

Aircrack-ng is a comprehensive suite of tools designed to evaluate the security of WiFi networks. It covers various aspects of WiFi security, including monitoring, attacking (replay attacks, deauthentication, fake access points), testing WiFi cards and driver capabilities, and cracking WEP and WPA PSK. The tools are command line-based, allowing for extensive scripting and have been utilized by many GUIs. Aircrack-ng primarily works on Linux but also supports Windows, macOS, FreeBSD, OpenBSD, NetBSD, Solaris, and eComStation 2.

github

: 5.2k