ai-atlas-nexus
AI Atlas Nexus: tooling to bring together resources related to governance of foundation models.
README:
👉 (Jun-2025) The demo projects repository showcases implementations of AI Atlas Nexus.
AI Atlas Nexus provides tooling to bring together resources related to governance of foundation models. We support a community-driven approach to curating and cataloguing resources such as datasets, benchmarks, and mitigations. Our goal is to turn abstract risk definitions into actionable workflows that streamline AI governance processes. By connecting fragmented resources, AI Atlas Nexus seeks to fill a critical gap in AI governance, enabling stakeholders to build more robust, transparent, and accountable systems. AI Atlas Nexus builds on the IBM AI Risk Atlas, making this educational resource a nexus of governance assets and tooling. A knowledge graph of an AI system provides a unified structure that links and contextualizes highly heterogeneous domain data.
Our intention is to create a starting point for an open AI Systems ontology, focused on risk, that the community can extend and enhance. This ontology serves as the foundation that unifies innovation and tooling in the AI risk space. By lowering the barrier to entry for developers, it fosters a governance-first approach to AI solutions, while also inviting the broader community to contribute their own tools and methodologies to expand its impact.
- 🏗️ An ontology that combines the AI risk view (taxonomies, risks, actions) with an AI model view (AI systems, AI models, model evaluations) into one coherent schema
- 📚 AI Risks collected from IBM AI Risk Atlas, IBM Granite Guardian, MIT AI Risk Repository, NIST Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, the AI Risk Taxonomy (AIR 2024), the AILuminate Benchmark, Credo's Unified Control Framework, and OWASP Top 10 for Large Language Model Applications
- 🔗 Mappings are proposed between the taxonomies and between risks and actions
- 🐍 Use the Python library methods to quickly explore available risks, relations, and actions
- 🚨 Use the Python library methods to detect potential risks in your use case (see the sketch after this list)
- 📤 Download an exported graph populated with data instances
- ✨ Example use case of auto-assistance in compliance questionnaires using Chain-of-Thought (CoT) examples and AI Atlas Nexus
- 🔧 Tooling to convert the LinkML schema and instance data into a Cypher representation to populate a graph database
- AI Risk Ontology
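As referenced in the feature list above, the following is a minimal, hypothetical sketch of exploring and detecting risks with the Python library. The module, class, and method names are illustrative assumptions, not the documented API; consult the AI Atlas Nexus Quickstart notebook for the real interface.

    # Hypothetical sketch only: names below are assumptions, not the documented API.
    from ai_atlas_nexus import AIAtlasNexus  # assumed import path

    nexus = AIAtlasNexus()

    # Assumed method: list all catalogued risks across taxonomies.
    risks = nexus.get_all_risks()
    print(len(risks), "risks catalogued")

    # Assumed method: detect potential risks for a free-text use case.
    usecase = "A chatbot that answers customer banking questions."
    for risk in nexus.identify_risks_from_usecases([usecase]):
        print(risk)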
Notebooks:
- AI Atlas Nexus Quickstart: Overview of library functionality
- Risk identification: Uncover risks related to your use case
- Auto assist questionnaire: Auto-fill a questionnaire using Chain-of-Thought or few-shot examples
- AI Tasks identification: Uncover AI tasks related to your use case
- AI Domain identification: Uncover the AI domain of your use case
- Risk Categorization: Assess and categorize the severity of risks associated with an AI system use case. Prompt templates are used with thanks to https://doi.org/10.48550/arXiv.2407.12454.
- Crosswalk: An example of generating crosswalk information between the risks of two different taxonomies.
- Risk to ARES Evaluation: The ARES integration for AI Atlas Nexus allows you to run AI robustness evaluations on AI systems derived from use cases.
Additional Resources:
- Demonstrations: A repo containing some demo applications using ai-atlas-nexus.
- Extensions: A repo containing extensions and a cookiecutter template to create new open-source ai-atlas-nexus extensions.
- IBM AI Risk Atlas
- Usage Governance Advisor: From Intent to AI Governance
This project targets Python versions ">=3.11, <3.12". You can download specific versions of Python here: https://www.python.org/downloads/
Note: Replace INFERENCE_LIB with one of the LLM inference libraries [ollama, vllm, wml, rits], as explained here
To install the current release
pip install "ai-atlas-nexus[INFERENCE_LIB]"
To install the latest code
git clone [email protected]:IBM/ai-atlas-nexus.git
cd ai-atlas-nexus
python -m venv v-ai-atlas-nexus
source v-ai-atlas-nexus/bin/activate
pip install -e ".[INFERENCE_LIB]"
AI Atlas Nexus uses Large Language Models (LLMs) to infer risks and related risk data, and therefore requires access to an LLM for inference. The following LLM inference APIs are supported:
- IBM Watsonx AI (Watson Machine Learning)
- Ollama
- vLLM
- RITS (IBM Internal Only)
When using the WML platform, you need to:
- Add configuration to the .env file as follows. Please follow this documentation on obtaining WML credentials.
WML_API_KEY=<WML api key goes here>
WML_API_URL=<WML url goes here>
WML_PROJECT_ID=<WML project id goes here, optional>
WML_SPACE_ID=<WML space id goes here, optional>
Either WML_PROJECT_ID or WML_SPACE_ID needs to be specified.
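As a minimal sketch (an assumption, not part of the documented setup), you can verify that the credentials are visible to your Python process with python-dotenv before installing the WML extras:

    # Hypothetical sanity check that WML credentials load from .env.
    # Assumes python-dotenv is installed: pip install python-dotenv
    import os
    from dotenv import load_dotenv

    load_dotenv()  # reads .env from the current working directory
    assert os.getenv("WML_API_KEY"), "WML_API_KEY missing from .env"
    assert os.getenv("WML_PROJECT_ID") or os.getenv("WML_SPACE_ID"), \
        "set either WML_PROJECT_ID or WML_SPACE_ID"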
- Install WML dependencies as follows:
pip install -e ".[wml]"
When using Ollama inference, you need to:
- Install Ollama dependencies as follows:
pip install -e ".[ollama]"-
Please follow the quickstart guide to start Ollama LLM server. Server will start by default at http://localhost:11434
-
When selecting Ollama engine in AI Atlas Nexus, use the server address
localhost:11434as theapi_urlin the credentials or set the environment variableOLLAMA_API_URLwith this value.
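Before pointing AI Atlas Nexus at the server, a quick connectivity check (a sketch using only the Python standard library; Ollama answers on its root endpoint when running):

    # Hypothetical connectivity check for a local Ollama server.
    import urllib.request

    with urllib.request.urlopen("http://localhost:11434") as resp:
        print(resp.status)  # 200 means the server is up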
When using vLLM inference, you need to:
- For Mac users, follow the instructions here. vLLM must be built from source to run natively on macOS.
- For Linux users, install vLLM dependencies as follows:
pip install -e ".[vllm]"
The above package is sufficient to run vLLM in one-off offline mode. When selecting vLLM execution from AI Atlas Nexus, pass the credentials as None to use vLLM offline mode.
- (Optional) To run vLLM on an OpenAI-compatible vLLM server, execute the command:
vllm serve ibm-granite/granite-3.1-8b-instruct --max_model_len 4096 --host localhost --port 8000 --api-key <CUSTOM_API_KEY>
The CUSTOM_API_KEY can be any string that you choose to use as your API key. The above command starts the vLLM server at http://localhost:8000. The server currently hosts one model at a time. Check all supported APIs at http://localhost:8000/docs
Note: When selecting the vLLM engine in AI Atlas Nexus, pass host:port as the api_url and the api_key from the vllm serve command above in the credentials.
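As a hedged smoke test (assuming the openai Python package is installed separately; it is not necessarily pulled in by AI Atlas Nexus), you can confirm the server and API key work before wiring them into the credentials:

    # Hypothetical smoke test against the OpenAI-compatible vLLM server above.
    # Assumes: pip install openai
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="<CUSTOM_API_KEY>")
    models = client.models.list()
    print([m.id for m in models.data])  # expect ibm-granite/granite-3.1-8b-instruct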
When using the RITS platform, you need to:
- Add configuration to the .env file as follows:
RITS_API_KEY=<RITS api key goes here>
RITS_API_URL=<RITS url goes here>
- Install RITS dependencies as follows:
pip install -e ".[rits]"Install AI Atlas Nexus extension using the below command
ran-extension install <EXTENSION_NAME>Currently, following extensions are available
- ran-ares-integration: ARES Integration for AI Atlas Nexus to run AI robustness evaluations on AI Systems derived from use cases.
- View the releases changelog.
If you use AI Atlas Nexus in your projects, please consider citing the following:
@article{airiskatlas2025,
title={AI Risk Atlas: Taxonomy and Tooling for Navigating AI Risks and Resources},
author={Frank Bagehorn and Kristina Brimijoin and Elizabeth M. Daly and Jessica He and Michael Hind and Luis Garces-Erice and Christopher Giblin and Ioana Giurgiu and Jacquelyn Martino and Rahul Nair and David Piorkowski and Ambrish Rawat and John Richards and Sean Rooney and Dhaval Salwala and Seshu Tirupathi and Peter Urbanetz and Kush R. Varshney and Inge Vejsbjerg and Mira L. Wolf-Bauwens},
year={2025},
eprint={2503.05780},
archivePrefix={arXiv},
primaryClass={cs.CY},
url={https://arxiv.org/abs/2503.05780}
}
AI Atlas Nexus is provided under the Apache 2.0 license.
- Get started by checking our contribution guidelines.
- Read the wiki for more technical and design details.
- If you have any questions, just ask!
- Contribute your own taxonomy files and CoT templates
Tip: Use the provided Makefile to regenerate the artifacts in this repository by running make.
- Try out a quick demo at the HF spaces demo site
- Read the publication AI Risk Atlas: Taxonomy and Tooling for Navigating AI Risks and Resources
- Explore IBM's AI Risk Atlas on the IBM documentation site
- View the demo projects repository showcasing implementations of AI Atlas Nexus.
- Read the IBM AI Ethics Board publication Foundation models: Opportunities, risks and mitigations, which goes into more detail about the risk taxonomy and describes IBM's point of view on the ethics of foundation models.
- 'Usage Governance Advisor: From Intent to AI Governance' presents a system for semi-structured governance information that identifies and prioritises risks according to the intended use case, recommends appropriate benchmarks and risk assessments, and proposes mitigation strategies and actions.
AI Atlas Nexus has been brought to you by IBM.