Awesome-AI-Security

Curated resources, research, and tools for securing AI systems

Stars: 98

Visit

Awesome-AI-Security is a curated list of resources for AI security, including tools, research papers, articles, and tutorials. It aims to provide a comprehensive overview of the latest developments in securing AI systems and preventing vulnerabilities. The repository covers topics such as adversarial attacks, privacy protection, model robustness, and secure deployment of AI applications. Whether you are a researcher, developer, or security professional, this collection of resources will help you stay informed and up-to-date in the rapidly evolving field of AI security.

README:

Awesome AI Security

Curated resources, research, and tools for securing AI systems.

Best Practices, Frameworks & Controls
- Governance & Management Frameworks
- Controls & Verification Standards
- Implementation Guides & Patterns
- Testing, Evaluation & Red Teaming
- Agentic Systems - (Standards, Governance & Patterns
- Threat Modeling
- Critical Infrastructure
Tools
Attack & Defense Matrices
Checklists
Newsletter
Datasets
Courses & Certifications
Training
Reports and Research
- Vendor Reports
- Research Papers
- Research Feed
- Reports
Communities & Social Groups
Benchmarking
Incident Response
- Incident Repositories, Trackers & Monitors
- Guides & Playbooks
Supply Chain Security
Videos & Playlists
Conferences
Foundations: Glossary, SoK/Surveys & Taxonomies
Podcasts
Market Landscape
Startups Blogs
Related Awesome Lists
Common Acronyms

Best Practices, Frameworks & Controls

Governance & Management Frameworks

Controls & Verification Standards

OWASP - LLM Security Verification Standard (LLMSVS)
OWASP - Artificial Intelligence Security Verification Standard (AISVS)
CSA - AI Controls Matrix (AICM) - The AICM contains 243 control objectives across 18 domains and maps to ISO 42001, ISO 27001, NIST AI RMF 1.0, and BSI AIC4. Freely downloadable.

Top 10s

Scoring & Rating Systems

OWASP - Artificial Intelligence Vulnerability Scoring System

Testing, Evaluation & Red Teaming

Implementation Guides & Patterns

OWASP
- OWASP - AI Security and Privacy Guide
- OWASP - LLM and Gen AI Data Security Best Practices
CSA - Secure LLM Systems: Essential Authorization Practices
OASIS CoSAI - Preparing Defenders of AI Systems
DoD CIO - AI Cybersecurity Risk Management Tailoring Guide (2025) - Practical RMF tailoring for AI systems across the lifecycle; complements CDAO’s RAI toolkit.
NCSC (UK) - Guidelines for Secure AI System Development - End-to-end secure AI SDLC (secure design, development, deployment, and secure operation & maintenance), including logging/monitoring and update management.
SANS – Critical AI Security Guidelines - Control-focused guidance for securing AI/LLM systems across six domains (e.g., access controls, data protection, inference security, monitoring, GRC).
BSI – Security of AI Systems: Fundamentals - Sector-agnostic fundamentals: lifecycle threat model (data/model/pipeline/runtime), adversarial ML attacks (poisoning, evasion, inversion, extraction, backdoors), and baseline controls for design→deploy→operate, plus assurance/certification guidance.
NSA - Artificial Intelligence Security Center (AISC)
- Deploying AI Systems Securely (CSI) - Practical, ops-focused guidance for deploying/operating externally developed AI systems (with CISA, FBI & international partners); complements NCSC’s secure-AI dev guidelines.
- AI Data Security: Best Practices for Securing Data Used to Train & Operate AI Systems (CSI) - Joint guidance on securing data across the AI lifecycle.
- Content Credentials: Strengthening Multimedia Integrity in the Generative AI Era (CSI) - Provenance and Durable Content Credentials for transparent media.
- Contextualizing Deepfake Threats to Organizations (CSI) - Risks, impacts, and mitigations for synthetic media targeting orgs.

Agentic Systems (Standards, Governance & Patterns)

Threat Modeling

OWASP - Multi-Agentic System Threat Modeling Guide — Applies OWASP’s agentic threat taxonomy to multi-agent systems and demonstrates modeling using the MAESTRO framework with worked examples.
AWS - Threat modeling your generative AI workload to evaluate security risk — Practical, four-question approach (what are we working on; what can go wrong; what are we going to do about it; did we do a good enough job) with concrete deliverables: DFDs and assumptions, threat statements using AWS’s threat grammar, mapped mitigations, and validation; includes worked examples and AWS Threat Composer templates.
Microsoft - Threat Modeling AI/ML Systems and Dependencies — Practical guidance for threat modeling AI/ML: “Key New Considerations” questions plus a threats→mitigations catalog (adversarial perturbation, data poisoning, model inversion, membership inference, model stealing) based on “Failure Modes in Machine Learning”; meant for security design reviews of products that use or depend on AI/ML.

Critical Infrastructure

DHS/CISA - Safety & Security Guidelines for Critical Infrastructure AI — Cross-lifecycle guidance for owners/operators (govern, design, develop, deploy, operate); developed with SRMAs and informed by CISA’s cross-sector risk analysis.

↑Tools

Inclusion criteria (open-source tools): must have 220+ GitHub stars, active maintenance in the last 12 months, and ≥3 contributors.

Prompt-Injection Detection & Mitigation

Detect and stop prompt-injection (direct/indirect) across inputs, context, and outputs; filter hostile content before it reaches tools or models.

(none from your current list yet)

Jailbreak & Policy Enforcement (Guardrails)

Enforce safety policies and block jailbreaks at runtime via rules/validators/DSLs, with optional human-in-the-loop for sensitive actions.

NeMo Guardrails
LLM Guard
Llama Guard
LlamaFirewall
Code Shield
Guardrails - Runtime policy enforcement for LLM apps: compose input/output validators (PII, toxicity, jailbreak/PI, regex, competitor checks), then block/redact/rewrite/retry on fail; optional server mode; also supports structured outputs (Pydantic/function-calling).

Model Artifact Scanners

Analyze serialized model files for unsafe deserialization and embedded code; verify integrity/metadata and block or quarantine on fail.

Agent Tooling and MCP Security

Scan/audit MCP servers & client configs; detect tool poisoning, unsafe flows; constrain tool access with least-privilege and audit trails.

Honeypots & Deception (MCP/LLM)

Beelzebub - Beelzebub is a honeypot framework designed to provide a secure environment for detecting and analyzing cyber attacks. It offers a low code approach for easy implementation and uses AI to mimic the behavior of a high-interaction honeypot.

Tool manifest/metadata validators

MCP Inspector
mcp-scan

Servers & Dev tooling

PortSwigger - MCP Server

Execution Sandboxing for Agent Code

Run untrusted or LLM-triggered code in isolated sandboxes (FS/network/process limits) to contain RCE and reduce blast radius.

E2B - SDK + self-hostable infra to run untrusted, LLM-generated code in isolated cloud sandboxes (Firecracker microVMs).

Gateways & Policy Proxies

Centralize auth, quotas/rate limits, cost caps, egress/DLP filters, and guardrail orchestration across all model/providers.

(none from your current list yet)

Code Review

Claude Code Security Reviewer - An AI-powered security review GitHub Action using Claude to analyze code changes for security vulnerabilities.
Vulnhuntr - Vulnhuntr leverages the power of LLMs to automatically create and analyze entire code call chains starting from remote user input and ending at server output for detection of complex, multi-step, security-bypassing vulnerabilities that go far beyond what traditional static code analysis tools are capable of performing.

Red-Teaming Harnesses & Automated Security Testing

Automate attack suites (prompt-injection, leakage, jailbreak, goal-based tasks) in CI; score results and produce regression evidence.

Goal-directed agent attack tasks

CI pipelines & regression gates

promptfoo
Agentic Radar
DeepTeam
Buttercup - Trail of Bits’ AIxCC Cyber Reasoning System: runs OSS-Fuzz-style campaigns to find vulns, then uses a multi-agent LLM patcher to generate & validate fixes for C/Java repos; ships SigNoz observability; requires at least one LLM API key.

Scoring/leaderboards & evidence reports

(none from your current list yet)

Supply Chain: AI/ML BOM and Attestation

Generate and verify AI/ML BOMs, signatures, and provenance for models/datasets/dependencies; enforce allow/deny policies.

(none from your current list yet)

Vector/Memory Store Security

Harden RAG memory: isolate namespaces, sanitize queries/content, detect poisoning/outliers, and prevent secret/PII retention.

(none from your current list yet)

Data/Model Poisoning Defenses

Detect and mitigate dataset/model poisoning and backdoors; validate training/fine-tuning integrity and prune suspicious behaviors.

Adversarial Robustness Toolbox (ART)

Sensitive Data Leak Prevention (DLP for AI)

Prevent secret/PII exfiltration in prompts/outputs via detection, redaction, and policy checks at I/O boundaries.

Presidio - PII/PHI detection & redaction for text, images, and structured data; use as a pre/post-LLM DLP filter and for dataset sanitization.

Monitoring, Logging & Anomaly Detection

Collect AI-specific security logs/signals; detect abuse patterns (PI/jailbreak/leakage), enrich alerts, and support forensics.

LangKit - LLM observability metrics toolkit (whylogs-compatible): prompt-injection/jailbreak similarity, PII patterns, hallucination/consistency, relevance, sentiment/toxicity, readability.
Alibi Detect - Production drift/outlier/adversarial detection for tabular, text, images, and time series; online/offline detectors with TF/PyTorch backends; returns scores, thresholds, and flags for alerting.

↑Attack & Defense Matrices

Matrix-style resources covering adversarial TTPs and curated defensive techniques for AI systems.

Attack

MITRE ATLAS - Adversarial TTP matrix and knowledge base for threats to AI systems.
GenAI Attacks Matrix - Matrix of TTPs targeting GenAI apps, copilots, and agents.
MCP Security Tactics, Techniques, and Procedures (TTPs)

Defense

AIDEFEND - AI Defense Framework - Interactive defensive countermeasures knowledge base with Tactics / Pillars / Phases views; maps mitigations to MITRE ATLAS, MAESTRO, and OWASP LLM risks. • Live demo: https://edward-playground.github.io/aidefense-framework/

↑Checklists

↑Supply Chain Security

Guidance and standards for securing the AI/ML software supply chain (models, datasets, code, pipelines). Primarily specs and frameworks; includes vetted TPRM templates.

Standards & Specs

Normative formats and specifications for transparency and traceability across AI components and dependencies.

OWASP - AI Bill of Materials (AIBOM) - Bill of materials format for AI components, datasets, and model dependencies.

Third-Party Assessment

Questionnaires and templates to assess external vendors, model providers, and integrators for security, privacy, and compliance.

FS-ISAC - Generative AI Vendor Evaluation & Qualitative Risk Assessment - Assessment Tool XLSX • Guide PDF - Vendor due-diligence toolkit for GenAI: risk tiering by use case, integration and data sensitivity; questionnaires across privacy, security, model development and validation, integration, legal and compliance; auto-generated reporting.

↑Videos & Playlists

Monthly curated playlists of AI-security talks, demos, incidents, and tooling.

↑Newsletter

Adversarial AI Digest - A digest of AI security research, threats, governance challenges, and best practices for securing AI systems.

↑Datasets

Meta-lists

Awesome Cybersecurity Datasets)

Dataset indexes & portals

Kaggle - Community-contributed datasets (IDS, phishing, malware URLs, incidents).
Hugging Face - Search HF datasets tagged/related to cybersecurity and threat intel.

Cybersecurity Skills

Interactive CTFs and self-contained labs for hands-on security skills (web, pwn, crypto, forensics, reversing). Used to assess practical reasoning, tool use, and end-to-end task execution.

CTF Challenges

InterCode-CTF - 100 picoCTF challenges (high-school level); categories: cryptography, web, binary exploitation (pwn), reverse engineering, forensics, miscellaneous. [Dataset+Benchmark] arXiv
NYU CTF Bench - 200 CSAW challenges (2017-2023); difficulty very easy → hard; categories: cryptography, web, binary exploitation (pwn), reverse engineering, forensics, miscellaneous. [Dataset+Benchmark] arXiv
CyBench - 40 tasks from HackTheBox, Sekai CTF, Glacier, HKCert (2022-2024); categories: cryptography, web, binary exploitation (pwn), reverse engineering, forensics, miscellaneous; difficulty grounded by first-solve time (FST). [Dataset+Benchmark] arXiv
pwn.college CTF Archive - large collection of runnable CTF challenges; commonly used as a source corpus for research. [Dataset]

Secure Code

Detection (classify vulnerable code)

Devign / CodeXGLUE-Vul - function-level C vuln detection. [Dataset+Benchmark]
DiverseVul - multi-CWE function-level detection (C/C++). [Dataset]
Big-Vul - real-world C/C++ detection (often with localization). [Dataset]

Repair & Patch Mining

CVEfixes - CVE-linked fix commits for security repair. [Dataset]
Also used for repair: Big-Vul (generate minimal diffs, then build + scan).

Runnable / Scanner Evaluation

OWASP Benchmark (Java) - runnable Java app with seeded vulns; supports SAST/DAST/IAST evaluation and scoring. [Dataset+Benchmark]
Juliet (NIST SARD) (C/C++ mirror • Java mirror ) - runnable CWE cases for detect → fix → re-test. [Dataset+Benchmark]

Phishing

Phishing dataset gap: there isn’t a public corpus that, per page, stores the URL plus full HTML/CSS/JS, images, favicon, and a screenshot. Most sources are just URL feeds; pages vanish quickly; older benchmarks drift, so models don’t generalize well. Collect a per-URL archive of all page resources, with caveats that screenshots are viewport-only and some assets may be blocked by browser safety.

PhishTank — Continuously updated dataset (API/feed); community-verified phishing URLs; labels zero-day phishing; offers webpage screenshots.
OpenPhish — Regularly updated phishing URLs with fields such as webpage info, hostname, supported language, IP presence, country code, and SSL certificate; includes brand-target stats.
PhreshPhish — 372k HTML–URL samples (119k phishing / 253k benign) with full-page HTML, URLs, timestamps, and brand targets (~185 brands) across 50+ languages; suitable for training and evaluating URL/page-based phishing detection.
Phishing.Database — Continuously updated lists of phishing domains/links/IPs (ACTIVE/INACTIVE/INVALID and NEW last hour/today); repo resets daily—download lists; status validated via PyFunceble.
UCI – Phishing Websites — 11,055 URLs (phishing and legitimate) with 30 engineered features across URL, content, and third-party signals.
Mendeley – Phishing Websites Dataset — Labeled phishing/legitimate samples; provides webpage content (HTML) for each URL.; useful for training/eval.
UCI – PhiUSIIL Phishing URL — 235,795 URLs (134,850 legitimate; 100,945 phishing) with 54 URL/content features; labels: Class 1 = legitimate, Class 0 = phishing.
MillerSmiles — Large archive of phishing email scams with the URLs used; long-running email corpus (not a live feed).

Cybersecurity Knowledge

Structured Q&A datasets assessing security knowledge and terminology. Used to evaluate factual recall and conceptual understanding.

CyberMetric

Secure Coding & Vulnerability Detection

Code snippet datasets labeled as vulnerable or secure, often tied to CWEs (Common Weakness Enumeration). Used to evaluate the model’s ability to recognize insecure code patterns and suggest secure fixes.

LLMSecEval
SecCodePLT

Deepfake

Audio (Speech) Deepfakes

ASVspoof 5 - train / dev / eval - Train: 8 TTS attacks; Dev: 8 unseen (validation/fusion); Eval: 16 unseen incl. adversarial/codec. Labels: bona-fide / spoofed. arXiv
In-the-Wild (ITW) - 58 politicians/celebrities with per-speaker pairing; ≈20.7 h bona-fide + 17.2 h spoofed, scraped from social/video platforms. Labels: bona-fide / spoofed. arXiv
MLAAD (+M-AILABS) - Multilingual synthetic TTS corpus (hundreds of hours; many models/languages). Labels: bona-fide (M-AILABS) / spoof (MLAAD). arXiv
LlamaPartialSpoof - LLM-driven attacker styles; includes full and partial (spliced) spoofs. Labels: bona-fide / fully-spoofed / partially-spoofed. arXiv
Fake-or-Real (FoR) - >195k utterances; four variants: for-original, for-norm, for-2sec, for-rerec. Labels: real / synthetic.
CodecFake - codec-based deepfake audio dataset (Interspeech 2024); Labels: real / codec-generated fake. arXiv

Video Deepfakes

Jailbreak & Guardrail Evaluation

Adversarial prompt datasets-both text-only and multimodal-designed to bypass safety mechanisms or test refusal logic. Used to test how effectively a model resists jailbreaks and enforces policy-based refusal.

Prompt Injection

Public prompt-injection datasets have recurring limitations: partial staleness as models and defenses evolve, CTF skew toward basic instruction following, and label mixing across toxicity, jailbreak roleplay, and true injections that inflates measured true positive rates and distorts evaluation.

prompt-injection-attack-dataset 3.7k rows pairing benign task prompts with attack variants (naive / escape / ignore / fake-completion / combined). Columns for both target and injected tasks; train split only.
prompt-injections-benchmark 5,000 prompts labeled jailbreak / benign for robustness evals.
prompt_injections ~1k short injection prompts; multilingual (EN, FR, DE, ES, IT, PT, RO); single train split; CSV/Parquet.
prompt-injection Large-scale injection/benign corpus (~327k rows, train/test) for training baselines and detectors.
prompt-injection-safety 60k rows (train 50k / test 10k); 3-way labels: benign 0, injection 1, harmful request 2; Parquet.

System Prompts

Collections of leaked, official, and synthetic system prompts and paired responses used to study guardrails and spot system prompt exposure. Used to build leakage detectors, craft targeted guardrail tests (consent gates, tool use rules, safety policies), and reproduce vendor behaviors for evaluation.

Official_LLM_System_Prompts - leaked and date-stamped prompts from proprietary assistants (OpenAI, Anthropic, MS Copilot, GitHub Copilot, Grok, Perplexity); 29 rows.
system-prompt-leakage - synthetic prompts + responses for leakage detection; train 283,353 / test 71,351 (binary leakage labels).
system-prompts-and-models-of-ai-tools - community collection of prompts and internal tool configs for code/IDE agents and apps (Cursor, VSCode Copilot Agent, Windsurf, Devin, v0, etc.); includes a security notice.
system_prompts_leaks - collection of extracted system prompts from popular chatbots like ChatGPT, Claude & Gemini
leaked-system-prompts - leaked prompts across many services; requires verifiable sources or reproducible prompts for PRs.
chatgpt_system_prompt - community collection of GPT system prompts, prompt-injection/leak techniques, and protection prompts.
CL4R1T4S - extracted/leaked prompts, guidelines, and tooling references spanning major assistants and agents (OpenAI, Google, Anthropic, xAI, Perplexity, Cursor, Devin, etc.).
grok-prompts - official xAI repository publishing Grok’s system prompts for chat/X features (DeepSearch, Ask Grok, Explain, etc.).

↑Courses & Certifications

Career Pathways

SANS - AI Cybersecurity Careers - Career pathways poster + training map; baseline skills for AI security (IR, DFIR, detection, threat hunting).

Courses (includes labs)

SANS - SEC545: GenAI & LLM Application Security - Hands-on course covering prompt injection, excessive agency, model supply chain, and defensive patterns. (Certificate of completion provided by SANS.)
SANS - SEC495: Leveraging LLMs: Building & Securing RAG, Contextual RAG, and Agentic RAG - Practical RAG builds with threat modeling, validation, and guardrails. (Certificate of completion provided by SANS.)
Practical DevSecOps - Certified AI Security Professional (CAISP) - Hands-on labs covering LLM Top 10, AI Attack and Defend techniques, MITRE ATLAS Framework, AI Threat Modeling, AI supply chain attacks, Secure AI Deployment, and AI Governance. (Certificate of completion provided by Practical DevSecOps.)

Professional Certifications (exam-based)

IAPP - Artificial Intelligence Governance Professional (AIGP) - Governance-focused credential aligned with emerging regulations.
ISACA - Advanced in AI Security Management (AAISM™) - AI-centric security management certification.
NIST AI RMF 1.0 Architect - Certified Information Security - Credential aligned to NIST AI RMF 1.0.
ISO/IEC 23894 - AI Risk Management (AI Risk Manager, PECB) - Risk identification, assessment, and mitigation aligned to ISO/IEC 23894 and NIST AI RMF.
ISO/IEC 42001 - AI Management System (Lead Implementer, PECB) - Implement an AIMS per ISO/IEC 42001.
ISO/IEC 42001 - AI Management System (Lead Auditor, PECB) - Audit AIMS using recognized principles.
ISACA - Advanced in AI Audit (AAIA™) - Certification for auditing AI systems and mitigating AI-related risks.
Practical DevSecOps - Certified AI Security Professional (CAISP) - Challenge-based exam certification simulating real-world AI security scenarios. 5 Challenges and 6 hours duration and report submission.

↑Training

Provider Training Portals

Microsoft AI Security Learning Path - Free, self-paced Microsoft content on secure AI model development, risk management, and threat mitigation.
AWS AI Security Training - Free AWS portal with courses on securing AI applications, risk management, and AI/ML security best practices.

Guided Tracks

PortSwigger - Web Security Academy: Web LLM attacks - Structured, guided track on LLM issues (prompt injection, insecure output handling, excessive agency) with walkthrough-style exercises.

CTFs & Challenges

AI GOAT - Vulnerable LLM CTF challenges for learning AI security.
Damn Vulnerable LLM Agent
AI Red Teaming Playground Labs - Microsoft - Self-hostable environment with 12 challenges (direct/indirect prompt injection, metaprompt extraction, Crescendo multi-turn, guardrail bypass).

Bespoke

Trail of Bits - AI/ML Security & Safety Training - Courses on AI failure modes, adversarial attacks, data provenance, pipeline threats, and mitigation.

↑Research Working Groups

Cloud Security Alliance (CSA) AI Security Working Groups - Collaborative research groups focused on AI security, cloud security, and emerging threats in AI-driven systems.
OWASP Top 10 for LLM & Generative AI Security Risks Project - An open-source initiative addressing critical security risks in Large Language Models (LLMs) and Generative AI applications, offering resources and guidelines to mitigate emerging threats.
CWE Artificial Intelligence Working Group (AI WG) - The AI WG was established by CWE™ and CVE® community stakeholders to identify and address gaps in the CWE corpus where AI-related weaknesses are not adequately covered, and work collaboratively to fix them.
NIST - SP 800-53 Control Overlays for Securing AI Systems (COSAiS) - Public collaboration to develop AI security control overlays with NIST principal investigators and the community.
OpenSSF - AI/ML Security Working Group - Cross-org WG on “security for AI” and “AI for security”
CoSAI - Coalition for Secure AI (OASIS Open Project) - Open, cross-industry initiative advancing secure-by-design AI through shared frameworks, tooling, and guidance.
- WS1: Software Supply Chain Security for AI Systems - Extends SSDF/SLSA principles to AI; provenance, model risks, and pipeline security.https://github.com/cosai-oasis/ws1-supply-chain
- WS2: Preparing Defenders for a Changing Cybersecurity Landscape - Defender-focused framework aligning threats, mitigations, and investments for AI-driven ops. https://github.com/cosai-oasis/ws2-defenders
  • Reference doc: “Preparing Defenders of AI Systems” https://github.com/cosai-oasis/ws2-defenders/blob/main/preparing-defenders-of-ai-systems.md
- WS3: AI Security Risk Governance - Security-focused risk & controls taxonomy, checklist, and scorecard for AI products and components.https://github.com/cosai-oasis/ws3-ai-risk-governance
- WS4: Secure Design Patterns for Agentic Systems - Threat models and secure design patterns for agentic systems and infrastructure. https://github.com/cosai-oasis/ws4-secure-design-agentic-systems

📌 (More working groups to be added.)

↑Communities & Social Groups

AI Security Hub (LinkedIn Group)

↑Benchmarking

Benchmarks

Purple Llama - CyberSecEval
JailbreakBench

Categories of AI Security Benchmarks

Adversarial Resilience

Purpose: Evaluates how AI systems withstand adversarial attacks, including evasion, poisoning, and model extraction. Ensures AI remains functional under manipulation.
NIST AI RMF Alignment: Measure, Manage

Measure: Identify risks related to adversarial attacks.
Manage: Implement mitigation strategies to ensure resilience.

AutoPenBench - 33 tasks: 22 in-vitro fundamentals (incl. 4 crypto) + 11 real-world CVEs for autonomous pentesting evaluation. arXiv • Best for: controlled, task-based coverage across fundamentals and known CVEs (repeatable, fine-grained scoring).

AI-Pentest-Benchmark - 13 full vulnerable VMs (from VulnHub), 152 subtasks across Recon (72), Exploit (44), PrivEsc (22), and General (14), for end-to-end recon → exploit → privesc benchmarking. arXiv • Best for: realistic, end-to-end machine takeovers stressing planning, tool use, and multi-step reasoning.

CVE-Bench - 40 real-world web CVEs in dockerized apps; evaluates agent-driven exploit generation/execution. arXiv • Best for: focused testing of exploitability against real CVEs (web).

NYU CTF Bench - 200 dockerized CSAW challenges (web, pwn, rev, forensics, crypto, misc.) for skill-granular agent evaluation. arXiv • Best for: CTF-style, per-skill assessment and tool-use drills.

Prompt Injection & Jailbreak Detection

Purpose: Evaluates resistance to prompt-injection and jailbreak attempts in chat/RAG/agent contexts.
NIST AI RMF Alignment: Measure, Manage

Lakera PINT Benchmark Prompt-injection benchmark with a curated multilingual test suite, explicit categories (injections, jailbreaks, hard negatives, benign chats/docs), and a reproducible scoring harness (PINT score + notebooks) for fair detector comparison and regression tracking.

Model & Data Integrity

Purpose: Assesses AI models for unauthorized modifications, including backdoors and dataset poisoning. Supports trustworthiness and security of model outputs.
NIST AI RMF Alignment: Map, Measure

Map: Understand and identify risks to model/data integrity.
Measure: Evaluate and mitigate risks through validation techniques.
CVE-Bench - @uiuc-kang-lab - How well AI agents can exploit real-world software vulnerabilities that are listed in the CVE database.

Governance & Compliance

Purpose: Ensures AI security aligns with governance frameworks, industry regulations, and security policies. Supports auditability and risk management.
NIST AI RMF Alignment: Govern

Govern: Establish policies, accountability structures, and compliance controls.

Privacy & Data Protection

Purpose: Evaluates AI for risks like data leakage, membership inference, and model inversion. Helps ensure privacy preservation and compliance.
NIST AI RMF Alignment: Measure, Manage

Measure: Identify and assess AI-related privacy risks.
Manage: Implement security controls to mitigate privacy threats.

Explainability & Trustworthiness

Purpose: Assesses AI for transparency, fairness, and bias mitigation. Ensures AI operates in an interpretable and ethical manner.
NIST AI RMF Alignment: Govern, Map, Measure

Govern: Establish policies for fairness, bias mitigation, and transparency.
Map: Identify potential explainability risks in AI decision-making.
Measure: Evaluate AI outputs for fairness, bias, and interpretability.

↑Incident Response

Incident Repositories, Trackers & Monitors

AI Incident Database (AIID)
MIT AI Risk Repository - Incident Tracker
AIAAIC Repository
OECD.AI - AIM: AI Incidents and Hazards Monitor
AVID - AI Vulnerability Database - Open, taxonomy-driven catalog of AI failure modes; Vulnerabilities, Reports map incidents to failure modes/lifecycle stages.

Guides & Playbooks

OWASP - GenAI Incident Response Guide
OWASP - Guide for Preparing & Responding to Deepfake Events
CISA - JCDC AI Cybersecurity Collaboration Playbook - Info-sharing & coordination procedures for AI incidents.
eSafety Commissioner - Guide to responding to image-based abuse involving AI deepfakes (PDF) - Practical, step-by-step playbook (school-focused but adaptable) covering reporting/takedown, evidence preservation, and support.

Regulatory Incident Reporting

EU AI Act - Article 73: Reporting of Serious Incidents - Providers of high-risk AI systems need to report serious incidents to national authorities.

↑Reports and Research

Vendor Reports

Vendor Reports

Research Papers

Research Papers

Research Feed

AI Security Research Feed - Continuously updated feed of AI security-related academic papers, preprints, and research indexed from arXiv.
AI Security Portal - Literature Database - Categorized database of AI security literature, taxonomy, and related resources.

Industry Alliance & Nonprofit Reports

CSA - Principles to Practice: Responsible AI in a Dynamic Regulatory Environment
CSA - AI Resilience: A Revolutionary Benchmarking Model for AI Safety - Governance & compliance benchmarking model.
CSA - Using AI for Offensive Security

📌 (More to be added - A collection of AI security reports, white papers, and academic studies.)

↑Foundations: Glossary, SoK/Surveys & Taxonomies

(Core references and syntheses for orientation and shared language.)

Glossary

(Authoritative definitions for AI/ML security, governance, and risk-use to align terminology across docs and reviews.)

NIST - “The Language of Trustworthy AI: An In-Depth Glossary of Terms.” - Authoritative cross-org terminology aligned to NIST AI RMF; useful for standardizing terms across teams.
ISO/IEC 22989:2022 - Artificial intelligence - Concepts and terminology - International standard that formalizes core AI concepts and vocabulary used in policy and engineering.

SoK & Surveys

(Systematizations of Knowledge (SoK), surveys, systematic reviews, and mapping studies.)

Taxonomy

(Reusable classification schemes-clear dimensions, categories, and labeling rules for attacks, defenses, datasets, and risks.)

CSA - Large Language Model (LLM) Threats Taxonomy - Community taxonomy of LLM-specific threats; clarifies categories/definitions for risk discussion and control mapping.
ARC - PI (Prompt Injection) Taxonomy - Focused taxonomy for prompt-injection behaviors/variants with practical labeling guidance for detection and defense.

↑Podcasts

The MLSecOps Podcast - Insightful conversations with industry leaders and AI experts, exploring the fascinating world of machine learning security operations.

↑Market Landscape

Curated market maps of tools and vendors for securing LLM and agentic AI applications across the lifecycle.

OWASP - LLM and Generative AI Security Solutions Landscape
OWASP - AI Security Solutions Landscape for Agentic AI
Latio - 2025 AI Security Report - Market trends and vendor landscape snapshot for AI security.
Woodside Capital Partners - Cybersecurity Sector - A snapshot with vendor breakdowns and landscape view.
Insight Partners - Cybersecurity Portfolio Overview (Market Map) - Visual market map and portfolio overview across cybersecurity domains.

↑Startups Blogs

A curated list of startups securing agentic AI applications, organized by the OWASP Agentic AI lifecycle (Scope & Plan → Govern). Each company appears once in its best-fit stage based on public positioning, and links point to blog/insights for deeper context. Some startups span multiple stages; placements reflect primary focus.

Inclusion criteria

Startup has not been acquired
Has an active blog
Has an active GitHub organization/repository

Scope & Plan

Design-time security: non-human identities, agent threat modeling, privilege boundaries/authn, and memory scoping/isolation.

no startups here with active blog and active GitHub account

Develop & Experiment

Secure agent loops and tool use; validate I/O contracts; embed policy hooks; test resilience during co-engineering.

no startups here with active blog and active GitHub account

Augment & Fine-Tune Data

Sanitize/trace data and reasoning; validate alignment; protect sensitive memory with privacy controls before deployment.

Skyflow

Test & Evaluate

Adversarial testing for goal drift, prompt injection, and tool misuse; red-team sims; sandboxed calls; decision validation.

Release

Sign models/plugins/memory; verify SBOMs; enforce cryptographically validated policies; register agents/capabilities.

no startups here with active blog and active GitHub account

Deploy

Zero-trust activation: rotate ephemeral creds, apply allowlists/LLM firewalls, and fine-grained least-privilege authorization.

Pomerium

Operate

Monitor memory mutations for drift/poisoning, detect abnormal loops/misuse, enforce HITL overrides, and scan plugins-continuous, real-time vigilance for resilient operations as systems scale and self-orchestrate.

Lasso Security

Monitor

Correlate agent steps/tools/comms; detect anomalies (e.g., goal reversal); keep immutable logs for auditability.

Govern

Enforce role/task policies, version/retire agents, prevent privilege creep, and align evidence with AI regulations.

↑Related Awesome Lists

↑Common Acronyms

Acronym	Full Form
AI	Artificial Intelligence
AGI	Artificial General Intelligence
ALBERT	A Lite BERT
AOC	Area Over Curve
ASR	Attack Success Rate
BERT	Bidirectional Encoder Representations from Transformers
BGMAttack	Black-box Generative Model-based Attack
CBA	Composite Backdoor Attack
CCPA	California Consumer Privacy Act
CNN	Convolutional Neural Network
CoT	Chain-of-Thought
DAN	Do Anything Now
DFS	Depth-First Search
DNN	Deep Neural Network
DPO	Direct Preference Optimization
DP	Differential Privacy
FL	Federated Learning
GA	Genetic Algorithm
GDPR	General Data Protection Regulation
GPT	Generative Pre-trained Transformer
GRPO	Group Relative Policy Optimization
HIPAA	Health Insurance Portability and Accountability Act
ICL	In-Context Learning
KL	Kullback-Leibler Divergence
LAS	Leakage-Adjusted Simulatability
LM	Language Model
LLM	Large Language Model
Llama	Large Language Model Meta AI
LoRA	Low-Rank Adapter
LRM	Large Reasoning Model
MCTS	Monte-Carlo Tree Search
MIA	Membership Inference Attack
MDP	Masking-Differential Prompting
MLM	Masked Language Model
MLLM	Multimodal Large Language Model
MLRM	Multimodal Large Reasoning Model
MoE	Mixture-of-Experts
NLP	Natural Language Processing
OOD	Out Of Distribution
ORM	Outcome Reward Model
PI	Prompt Injection
PII	Personally Identifiable Information
PAIR	Prompt Automatic Iterative Refinement
PLM	pre-trained Language Model
PRM	Process Reward Model
QA	Question-Answering
RAG	Retrieval-Augmented Generation
RL	Reinforcement Learning
RLHF	Reinforcement Learning from Human Feedback
RLVR	Reinforcement Learning with Verifiable Reward
RoBERTa	Robustly optimized BERT approach
SCM	Structural Causal Model
SGD	Stochastic Gradient Descent
SOTA	State of the Art
TAG	Gradient Attack on Transformer-based Language Models
VR	Verifiable Reward
XLNet	Transformer-XL with autoregressive and autoencoding pre-training

↑Contributing

Contributions are welcome! If you have new resources, tools, or insights to add, feel free to submit a pull request.

This repository follows the Awesome Manifesto guidelines.

↑License

For Tasks:

Click tags to check more tools for each tasks

detect vulnerabilities protect privacy ensure model robustness prevent attacks secure ai systems

For Jobs:

security analyst ai researcher data scientist cybersecurity consultant machine learning engineer

Alternative AI tools for Awesome-AI-Security

Similar Open Source Tools

Awesome-AI-Security

github

: 98

Disciplined-AI-Software-Development

Disciplined AI Software Development is a comprehensive repository that provides guidelines and best practices for developing AI software in a disciplined manner. It covers topics such as project organization, code structure, documentation, testing, and deployment strategies to ensure the reliability, scalability, and maintainability of AI applications. The repository aims to help developers and teams navigate the complexities of AI development by offering practical advice and examples to follow.

github

: 258

lemonai

LemonAI is a versatile machine learning library designed to simplify the process of building and deploying AI models. It provides a wide range of tools and algorithms for data preprocessing, model training, and evaluation. With LemonAI, users can easily experiment with different machine learning techniques and optimize their models for various tasks. The library is well-documented and beginner-friendly, making it suitable for both novice and experienced data scientists. LemonAI aims to streamline the development of AI applications and empower users to create innovative solutions using state-of-the-art machine learning methods.

github

: 994

ai-manus

AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment. It offers deployment with minimal dependencies, supports multiple tools like Terminal, Browser, File, Web Search, and messaging tools, allocates separate sandboxes for tasks, manages session history, supports stopping and interrupting conversations, file upload and download, and is multilingual. The system also provides user login and authentication. The project primarily relies on Docker for development and deployment, with model capability requirements and recommended Deepseek and GPT models.

github

: 976

God-Level-AI

A drill of scientific methods, processes, algorithms, and systems to build stories & models. An in-depth learning resource for humans. This repository is designed for individuals aiming to excel in the field of Data and AI, providing video sessions and text content for learning. It caters to those in leadership positions, professionals, and students, emphasizing the need for dedicated effort to achieve excellence in the tech field. The content covers various topics with a focus on practical application.

github

: 3.5k

ramalama

The Ramalama project simplifies working with AI by utilizing OCI containers. It automatically detects GPU support, pulls necessary software in a container, and runs AI models. Users can list, pull, run, and serve models easily. The tool aims to support various GPUs and platforms in the future, making AI setup hassle-free.

github

: 2.1k

open-ai

Open AI is a powerful tool for artificial intelligence research and development. It provides a wide range of machine learning models and algorithms, making it easier for developers to create innovative AI applications. With Open AI, users can explore cutting-edge technologies such as natural language processing, computer vision, and reinforcement learning. The platform offers a user-friendly interface and comprehensive documentation to support users in building and deploying AI solutions. Whether you are a beginner or an experienced AI practitioner, Open AI offers the tools and resources you need to accelerate your AI projects and stay ahead in the rapidly evolving field of artificial intelligence.

github

: 2.1k

agents

Cloudflare Agents is a framework for building intelligent, stateful agents that persist, think, and evolve at the edge of the network. It allows for maintaining persistent state and memory, real-time communication, processing and learning from interactions, autonomous operation at global scale, and hibernating when idle. The project is actively evolving with focus on core agent framework, WebSocket communication, HTTP endpoints, React integration, and basic AI chat capabilities. Future developments include advanced memory systems, WebRTC for audio/video, email integration, evaluation framework, enhanced observability, and self-hosting guide.

github

: 2.5k

azure-ai-docs

Azure AI Docs is a repository that provides detailed documentation and resources for developers looking to leverage Microsoft's AI services on the Azure platform. The repository covers a wide range of topics including machine learning, natural language processing, computer vision, and more. Developers can find tutorials, code samples, best practices, and guidelines to help them integrate AI capabilities into their applications seamlessly.

github

: 104

ml-retreat

ML-Retreat is a comprehensive machine learning library designed to simplify and streamline the process of building and deploying machine learning models. It provides a wide range of tools and utilities for data preprocessing, model training, evaluation, and deployment. With ML-Retreat, users can easily experiment with different algorithms, hyperparameters, and feature engineering techniques to optimize their models. The library is built with a focus on scalability, performance, and ease of use, making it suitable for both beginners and experienced machine learning practitioners.

github

: 2.2k

ai-workshop-code

The ai-workshop-code repository contains code examples and tutorials for various artificial intelligence concepts and algorithms. It serves as a practical resource for individuals looking to learn and implement AI techniques in their projects. The repository covers a wide range of topics, including machine learning, deep learning, natural language processing, computer vision, and reinforcement learning. By exploring the code and following the tutorials, users can gain hands-on experience with AI technologies and enhance their understanding of how these algorithms work in practice.

github

: 375

xllm

xLLM is an efficient LLM inference framework optimized for Chinese AI accelerators, enabling enterprise-grade deployment with enhanced efficiency and reduced cost. It adopts a service-engine decoupled inference architecture, achieving breakthrough efficiency through technologies like elastic scheduling, dynamic PD disaggregation, multi-stream parallel computing, graph fusion optimization, and global KV cache management. xLLM supports deployment of mainstream large models on Chinese AI accelerators, empowering enterprises in scenarios like intelligent customer service, risk control, supply chain optimization, ad recommendation, and more.

github

: 462

cs-self-learning

This repository serves as an archive for computer science learning notes, codes, and materials. It covers a wide range of topics including basic knowledge, AI, backend & big data, tools, and other related areas. The content is organized into sections and subsections for easy navigation and reference. Users can find learning resources, programming practices, and tutorials on various subjects such as languages, data structures & algorithms, AI, frameworks, databases, development tools, and more. The repository aims to support self-learning and skill development in the field of computer science.

github

: 53

koog

Koog is a Kotlin-based framework for building and running AI agents entirely in idiomatic Kotlin. It allows users to create agents that interact with tools, handle complex workflows, and communicate with users. Key features include pure Kotlin implementation, MCP integration, embedding capabilities, custom tool creation, ready-to-use components, intelligent history compression, powerful streaming API, persistent agent memory, comprehensive tracing, flexible graph workflows, modular feature system, scalable architecture, and multiplatform support.

github

: 3.2k

jadx-ai-mcp

JADX-AI-MCP is a plugin for the JADX decompiler that integrates with Model Context Protocol (MCP) to provide live reverse engineering support with LLMs like Claude. It allows for quick analysis, vulnerability detection, and AI code modification, all in real time. The tool combines JADX-AI-MCP and JADX MCP SERVER to analyze Android APKs effortlessly. It offers various prompts for code understanding, vulnerability detection, reverse engineering helpers, static analysis, AI code modification, and documentation. The tool is part of the Zin MCP Suite and aims to connect all android reverse engineering and APK modification tools with a single MCP server for easy reverse engineering of APK files.

github

: 493

pdr_ai_v2

pdr_ai_v2 is a Python library for implementing machine learning algorithms and models. It provides a wide range of tools and functionalities for data preprocessing, model training, evaluation, and deployment. The library is designed to be user-friendly and efficient, making it suitable for both beginners and experienced data scientists. With pdr_ai_v2, users can easily build and deploy machine learning models for various applications, such as classification, regression, clustering, and more.

github

: 599

For similar tasks

watchtower

AIShield Watchtower is a tool designed to fortify the security of AI/ML models and Jupyter notebooks by automating model and notebook discoveries, conducting vulnerability scans, and categorizing risks into 'low,' 'medium,' 'high,' and 'critical' levels. It supports scanning of public GitHub repositories, Hugging Face repositories, AWS S3 buckets, and local systems. The tool generates comprehensive reports, offers a user-friendly interface, and aligns with industry standards like OWASP, MITRE, and CWE. It aims to address the security blind spots surrounding Jupyter notebooks and AI models, providing organizations with a tailored approach to enhancing their security efforts.

github

: 187

LLM-PLSE-paper

LLM-PLSE-paper is a repository focused on the applications of Large Language Models (LLMs) in Programming Language and Software Engineering (PL/SE) domains. It covers a wide range of topics including bug detection, specification inference and verification, code generation, fuzzing and testing, code model and reasoning, code understanding, IDE technologies, prompting for reasoning tasks, and agent/tool usage and planning. The repository provides a comprehensive collection of research papers, benchmarks, empirical studies, and frameworks related to the capabilities of LLMs in various PL/SE tasks.

github

: 125

invariant

Invariant Analyzer is an open-source scanner designed for LLM-based AI agents to find bugs, vulnerabilities, and security threats. It scans agent execution traces to identify issues like looping behavior, data leaks, prompt injections, and unsafe code execution. The tool offers a library of built-in checkers, an expressive policy language, data flow analysis, real-time monitoring, and extensible architecture for custom checkers. It helps developers debug AI agents, scan for security violations, and prevent security issues and data breaches during runtime. The analyzer leverages deep contextual understanding and a purpose-built rule matching engine for security policy enforcement.

github

: 143

OpenRedTeaming

OpenRedTeaming is a repository focused on red teaming for generative models, specifically large language models (LLMs). The repository provides a comprehensive survey on potential attacks on GenAI and robust safeguards. It covers attack strategies, evaluation metrics, benchmarks, and defensive approaches. The repository also implements over 30 auto red teaming methods. It includes surveys, taxonomies, attack strategies, and risks related to LLMs. The goal is to understand vulnerabilities and develop defenses against adversarial attacks on large language models.

github

: 68

Awesome-LLM4Cybersecurity

The repository 'Awesome-LLM4Cybersecurity' provides a comprehensive overview of the applications of Large Language Models (LLMs) in cybersecurity. It includes a systematic literature review covering topics such as constructing cybersecurity-oriented domain LLMs, potential applications of LLMs in cybersecurity, and research directions in the field. The repository analyzes various benchmarks, datasets, and applications of LLMs in cybersecurity tasks like threat intelligence, fuzzing, vulnerabilities detection, insecure code generation, program repair, anomaly detection, and LLM-assisted attacks.

github

: 681

quark-engine

Quark Engine is an AI-powered tool designed for analyzing Android APK files. It focuses on enhancing the detection process for auto-suggestion, enabling users to create detection workflows without coding. The tool offers an intuitive drag-and-drop interface for workflow adjustments and updates. Quark Agent, the core component, generates Quark Script code based on natural language input and feedback. The project is committed to providing a user-friendly experience for designing detection workflows through textual and visual methods. Various features are still under development and will be rolled out gradually.

github

: 1.4k

vulnerability-analysis

The NVIDIA AI Blueprint for Vulnerability Analysis for Container Security showcases accelerated analysis on common vulnerabilities and exposures (CVE) at an enterprise scale, reducing mitigation time from days to seconds. It enables security analysts to determine software package vulnerabilities using large language models (LLMs) and retrieval-augmented generation (RAG). The blueprint is designed for security analysts, IT engineers, and AI practitioners in cybersecurity. It requires NVAIE developer license and API keys for vulnerability databases, search engines, and LLM model services. Hardware requirements include L40 GPU for pipeline operation and optional LLM NIM and Embedding NIM. The workflow involves LLM pipeline for CVE impact analysis, utilizing LLM planner, agent, and summarization nodes. The blueprint uses NVIDIA NIM microservices and Morpheus Cybersecurity AI SDK for vulnerability analysis.

github

: 86

CodeAsk

CodeAsk is a code analysis tool designed to tackle complex issues such as code that seems to self-replicate, cryptic comments left by predecessors, messy and unclear code, and long-lasting temporary solutions. It offers intelligent code organization and analysis, security vulnerability detection, code quality assessment, and other interesting prompts to help users understand and work with legacy code more efficiently. The tool aims to translate 'legacy code mountains' into understandable language, creating an illusion of comprehension and facilitating knowledge transfer to new team members.

github

: 820

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 980

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.9k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 32.1k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

github

: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

github

: 675

Awesome-AI-Security

README:

Awesome AI Security

Table of Contents

Best Practices, Frameworks & Controls

Governance & Management Frameworks

Controls & Verification Standards

Top 10s

Scoring & Rating Systems

Testing, Evaluation & Red Teaming

Implementation Guides & Patterns

Agentic Systems (Standards, Governance & Patterns)

Threat Modeling

Critical Infrastructure

↑Tools

Prompt-Injection Detection & Mitigation

Jailbreak & Policy Enforcement (Guardrails)

Model Artifact Scanners

Agent Tooling and MCP Security

Honeypots & Deception (MCP/LLM)

Tool manifest/metadata validators

Servers & Dev tooling

Execution Sandboxing for Agent Code

Gateways & Policy Proxies

Code Review

Red-Teaming Harnesses & Automated Security Testing

Prompt-injection test suites

Data-leakage/secret-exfil test suites

Jailbreak catalogs & adversarial prompts

Adversarial-robustness (evasion) toolkits

Goal-directed agent attack tasks

CI pipelines & regression gates

Scoring/leaderboards & evidence reports

Supply Chain: AI/ML BOM and Attestation

Vector/Memory Store Security

Data/Model Poisoning Defenses

Sensitive Data Leak Prevention (DLP for AI)

Monitoring, Logging & Anomaly Detection

↑Attack & Defense Matrices

Attack

Defense

↑Checklists

↑Supply Chain Security

Standards & Specs

Third-Party Assessment

↑Videos & Playlists

↑Newsletter

↑Datasets

Meta-lists

Dataset indexes & portals

Cybersecurity Skills

CTF Challenges

Secure Code

Detection (classify vulnerable code)

Repair & Patch Mining

Runnable / Scanner Evaluation

Phishing

Cybersecurity Knowledge

Secure Coding & Vulnerability Detection

Deepfake

Audio (Speech) Deepfakes

Video Deepfakes

Jailbreak & Guardrail Evaluation

Prompt Injection

System Prompts

↑Courses & Certifications

Career Pathways

Courses (includes labs)

Professional Certifications (exam-based)

↑Training

Provider Training Portals

Guided Tracks

CTFs & Challenges

Bespoke

↑Research Working Groups

↑Communities & Social Groups

↑Benchmarking

Benchmarks

Categories of AI Security Benchmarks

Adversarial Resilience