
mlir-aie
An MLIR-based toolchain for AMD AI Engine-enabled devices.

This repository contains an MLIR-based toolchain for AI Engine-enabled devices, such as AMD Ryzen™ AI and Versal™. It can be used to generate low-level configurations for the AI Engine portion of these devices. AI Engines are organized as a spatial array of tiles, where each tile contains AI Engine cores and/or memories. The spatial array is connected by stream switches that can be configured to route data between AI Engine tiles, with transfers scheduled by their programmable Data Movement Accelerators (DMAs). The repository provides MLIR representations at multiple levels of abstraction for targeting AI Engine devices, enabling compilers and developers to program AI Engine cores as well as describe data movements and array connectivity. A Python API is available as a convenient interface for generating MLIR design descriptions. Backend code generation is also included, targeting the aie-rt library. This toolchain uses the AI Engine compiler tool, which is part of the AMD Vitis™ software installation; these tools require a free license from the Product Licensing Site.
README:
This project emphasizes fast, open-source toolchains for NPU devices, including LLVM-based code generation. IRON contains a close-to-metal toolkit that empowers performance engineers to create fast and efficient designs for Ryzen™ AI NPUs powered by AI Engines. It provides Python APIs that enable developers to harness the unique architectural capabilities of AMD's NPUs. However, this project is not intended to be an end-to-end compilation flow for all application designs; it is designed to complement, not replace, mainstream NPU tooling for inference such as the AMD Ryzen™ AI Software Platform. Targeting researchers and enthusiasts, IRON is designed to unlock the full potential of NPUs for a wide range of workloads, from machine learning to digital signal processing and beyond. This repository includes programming guides and examples demonstrating the APIs. Additionally, the Peano component extends the LLVM framework with support for the AI Engine processor as a target architecture, enabling integration with popular compiler frontends such as clang. Developers can leverage the AIE API header library to implement efficient vectorized AIE core code in C++ that can be compiled by Peano.
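To give a flavor of the Python interface, the sketch below builds a one-tile design and prints the generated MLIR. It is a minimal illustration modeled on the repository's programming examples; the module paths, helper names, and device enum used here (e.g., AIEDevice.npu1_1col) are assumptions that can differ between releases, so consult the programming guide for the current API.

# Minimal sketch: generate an MLIR-AIE design description from Python.
# The imports and names below follow the repository's programming
# examples and may vary by release.
from aie.dialects.aie import AIEDevice, core, device, tile
from aie.extras.context import mlir_mod_ctx

with mlir_mod_ctx() as ctx:

    @device(AIEDevice.npu1_1col)
    def device_body():
        # One compute tile at column 0, row 2 of the spatial array.
        compute_tile = tile(0, 2)

        @core(compute_tile)
        def core_body():
            # The AIE core program (compute loops, vector ops) goes here.
            pass

    # Emit the textual MLIR that the rest of the toolchain lowers.
    print(ctx.module)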
The IRON Python API for Ryzen™ AI NPUs is described in the following paper:
E. Hunhoff, J. Melber, K. Denolf, A. Bisca, S. Bayliss, S. Neuendorffer, J. Fifield, J. Lo, P. Vasireddy, P. James-Roxby, E. Keller. "Efficiency, Expressivity, and Extensibility in a Close-to-Metal NPU Programming Interface". In 33rd IEEE International Symposium On Field-Programmable Custom Computing Machines, May 2025.
These instructions will guide you through everything required for building and executing a program on the Ryzen™ AI NPU, starting from a fresh bare-bones Ubuntu 24.04 or Ubuntu 24.10 install.
Be sure you have the latest BIOS on your laptop or mini-PC that enables the NPU. See here.
If starting from Ubuntu 24.04, you may need to update the Linux kernel to 6.11+ by installing the Hardware Enablement (HWE) stack:
sudo apt update
sudo apt install --install-recommends linux-generic-hwe-24.04
sudo reboot
Turn off Secure Boot (allows unsigned drivers to be installed):
BIOS → Security → Secure boot → Disable
- Execute the scripted build process:
This script will install package dependencies, build the xdna-driver and xrt packages, and install them. These steps require sudo access.
bash ./utils/build_drivers.sh
- Reboot as directed after the script exits.
sudo reboot
- Check that the NPU is working by confirming the device appears in xrt-smi:
source /opt/xilinx/xrt/setup.sh
xrt-smi examine
At the bottom of the output you should see:
Devices present
BDF             :  Name
------------------------------------
[0000:66:00.1]  :  NPU Strix
- Install the following packages needed for MLIR-AIE:
# Python versions 3.10, 3.12 and 3.13 are currently supported by our wheels
sudo apt install \
  build-essential clang clang-14 lld lld-14 cmake ninja-build python3-venv python3-pip
- (Optional) Install OpenCV, which is needed for the vision programming examples:
sudo apt install libopencv-dev python3-opencv
- Clone the mlir-aie repository:
git clone https://github.com/Xilinx/mlir-aie.git
cd mlir-aie
- Set up a virtual environment:
python3 -m venv ironenv
source ironenv/bin/activate
python3 -m pip install --upgrade pip
- Install IRON library, mlir-aie and llvm-aie compilers from wheels and dependencies:

For release v1.0:
# Install IRON library and mlir-aie from a wheel
python3 -m pip install mlir_aie -f https://github.com/Xilinx/mlir-aie/releases/expanded_assets/v1.0
# Install Peano from a llvm-aie wheel
python3 -m pip install https://github.com/Xilinx/llvm-aie/releases/download/nightly/llvm_aie-19.0.0.2025041501+b2a279c1-py3-none-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
# Install basic Python requirements (still needed for release v1.0, but is no longer needed for latest wheels)
python3 -m pip install -r python/requirements.txt
# Install MLIR Python Extras
HOST_MLIR_PYTHON_PACKAGE_PREFIX=aie python3 -m pip install -r python/requirements_extras.txt

For daily latest:
# Install IRON library and mlir-aie from a wheel
python3 -m pip install mlir_aie -f https://github.com/Xilinx/mlir-aie/releases/expanded_assets/latest-wheels-2
# Install Peano from a llvm-aie wheel
python3 -m pip install llvm-aie -f https://github.com/Xilinx/llvm-aie/releases/expanded_assets/nightly
# Install MLIR Python Extras
HOST_MLIR_PYTHON_PACKAGE_PREFIX=aie python3 -m pip install -r python/requirements_extras.txt
- (Optional) Install Python packages required for development and testing:
# Install Python requirements for development and testing
python3 -m pip install -r python/requirements_dev.txt
# This installs the pre-commit hooks defined in .pre-commit-config.yaml
pre-commit install
- Set up the environment:
source utils/env_setup.sh
- (Optional) Install ML Python packages for the ML programming examples:
# Install Torch for ML examples
python3 -m pip install -r python/requirements_ml.txt
- (Optional) Install Jupyter Notebook Python packages:
# Install Jupyter Notebook
python3 -m pip install -r python/requirements_notebook.txt
# This creates an ipykernel (for use in notebooks) using the ironenv venv
python3 -m ipykernel install --user --name ironenv
# Only for release v1.0 and non-wheel-based installs:
# The install location is generally captured in $PYTHONPATH by the `env_setup.sh` script.
# However, Jupyter notebooks don't always get access to PYTHONPATH (e.g., if they are run with
# vscode), so we save ${MLIR_AIE_INSTALL_DIR}/python in a .pth file in the site-packages dir of
# the ironenv venv; this allows the iron ipykernel to find the install dir regardless of whether
# PYTHONPATH is available.
MLIR_AIE_INSTALL=$(pip show mlir_aie | grep ^Location: | awk '{print $2}')/mlir_aie
venv_site_packages=$(python3 -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')
echo ${MLIR_AIE_INSTALL}/python > $venv_site_packages/mlir-aie.pth
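If a notebook later fails to import the bindings, a quick sanity check inside the ironenv kernel (a suggested check, not part of the original instructions; it assumes the wheel installs its bindings as the top-level aie package) is:

# Confirm the notebook kernel can locate the mlir-aie Python bindings.
import aie
print(aie.__file__)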
For your design of interest, for instance from programming_examples, two steps are needed: (i) build the AIE design and (ii) build the host code.
- Go to the design of interest and run:
make
- Build the host code and execute the design:
make run
- Continue to the IRON AIE Application Programming Guide.
- Additional MLIR-AIE documentation is available on the website.
- AIE API header library documentation for single-core AIE programming in C++ is available here.
- If you are a university researcher or student interested in trying these tools on our Ryzen™ AI AUP Cloud systems, please contact the AMD University Program.
You may skip the Vitis™ installation step if you intend to target only AMD XDNA™/AIE-ML (AIE2) and AMD XDNA™ 2 (AIE2P) using our open-source single-core compiler Peano. Compiling with xchesscc is not supported without installing AMD Vitis™ AIE Essentials.
- Install Vitis™ AIE Essentials from Ryzen AI Software 1.3 Early Access. We will assume you use the installation directory /tools/ryzen_ai-1.3.0/vitis_aie_essentials. This is an early-access lounge; you must register and be granted access at this time.
- Download the VAIML installer for Linux-based compilation: ryzen_ai-1.3.0ea1.tgz
- Extract the required tools:
tar -xzvf ryzen_ai-1.3.0ea1.tgz
cd ryzen_ai-1.3.0
mkdir vitis_aie_essentials
mv vitis_aie_essentials*.whl vitis_aie_essentials
cd vitis_aie_essentials
unzip vitis_aie_essentials*.whl
- Set up an AI Engine license:
  - Get a local license for AI Engine tools from https://www.xilinx.com/getlicense.
  - Copy your license file (Xilinx.lic) to your preferred location, e.g. /opt/Xilinx.lic.
- Set up your environment for the Vitis™ AIE tools using the following script:
#!/bin/bash
#################################################################################
# Setup Vitis AIE Essentials
#################################################################################
export AIETOOLS_ROOT=/tools/ryzen_ai-1.3.0/vitis_aie_essentials
export PATH=$PATH:${AIETOOLS_ROOT}/bin
export LM_LICENSE_FILE=/opt/Xilinx.lic
Be sure you have the latest BIOS for your laptop or mini PC; this will ensure the NPU (sometimes referred to as IPU) is enabled in the system. You may need to manually enable the NPU:
Advanced → CPU Configuration → IPU
NOTE: Some manufacturers only provide Windows executables to update the BIOS; please do this before installing Ubuntu.
IRON AIE Application Programming Guide
Building mlir-aie tools from source
MLIR Dialect and Compiler Documentation
Interested in contributing to MLIR-AIE? Information for developers
Copyright © 2019-2024 Advanced Micro Devices, Inc.
Similar Open Source Tools


kvcached
kvcached is a new KV cache management system that supports on-demand KV cache allocation. It implements the concept of GPU virtual memory, allowing applications to reserve virtual address space without immediately committing physical memory. Physical memory is then automatically allocated and mapped as needed at runtime. This capability allows multiple LLMs to run concurrently on a single GPU or a group of GPUs (TP) and flexibly share the GPU memory, significantly improving GPU utilization and reducing memory fragmentation. kvcached is compatible with popular LLM serving engines, including SGLang and vLLM.

JetStream
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome). It is designed to provide high performance and scalability for large language models, enabling efficient inference on cloud-based TPUs. JetStream leverages XLA to optimize the execution of LLM models, resulting in faster and more efficient inference. Additionally, JetStream supports quantization techniques to further enhance performance and reduce memory consumption. By utilizing JetStream, developers can deploy and run LLM models on TPUs with ease, achieving optimal performance and cost-effectiveness.

JetStream
JetStream is a throughput and memory optimized engine for Large Language Model (LLM) inference on XLA devices, specifically TPUs. It provides reference engine implementations for Jax and Pytorch models, along with documentation for online inference, serving Gemma using TPUs on GKE, benchmarking, observability, profiling, and standalone local setup. Users can easily set up a local server, run tests, and test core modules. JetStream aims to enhance the performance of LLM inference on XLA devices.

Bavarder
Bavarder is an AI-powered chit-chat tool designed for informal conversations about unimportant matters. Users can engage in light-hearted discussions with the AI, simulating casual chit-chat scenarios. The tool provides a platform for users to interact with AI in a fun and entertaining way, offering a unique experience of engaging with artificial intelligence in a conversational manner.

pianotrans
ByteDance's Piano Transcription is a PyTorch implementation for transcribing piano recordings into MIDI files with pedals. This repository provides a simple GUI and packaging for Windows and Nix on Linux/macOS. It supports using GPU for inference and includes CLI usage. Users can upgrade the tool and report issues to the upstream project. The tool focuses on providing MIDI files, and any other improvements to transcription results should be directed to the original project.

docetl
DocETL is a tool for creating and executing data processing pipelines, especially suited for complex document processing tasks. It offers a low-code, declarative YAML interface to define LLM-powered operations on complex data. Ideal for maximizing correctness and output quality for semantic processing on a collection of data, representing complex tasks via map-reduce, maximizing LLM accuracy, handling long documents, and automating task retries based on validation criteria.

SiLLM
SiLLM is a toolkit that simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. It provides features such as LLM loading, LoRA training, DPO training, a web app for a seamless chat experience, an API server with OpenAI compatible chat endpoints, and command-line interface (CLI) scripts for chat, server, LoRA fine-tuning, DPO fine-tuning, conversion, and quantization.

copywriterproai-backend
CopywriterProAI is the world's first open-source AI writing platform for SEO and Ad Copy. The backend repository powers the AI capabilities and manages content processing for smooth operation. It provides an AI writing assistant that works behind the scenes to assist users in content creation.

shinkai-apps
Shinkai apps unlock the full capabilities/automation of first-class LLM (AI) support in the web browser. It enables creating multiple agents, each connected to either local or 3rd-party LLMs (ex. OpenAI GPT), which have permissioned (meaning secure) access to act in every webpage you visit. There is a companion repo called Shinkai Node, that allows you to set up the node anywhere as the central unit of the Shinkai Network, handling tasks such as agent management, job processing, and secure communications.

vim-airline
Vim-airline is a lean and mean status/tabline plugin for Vim that provides a nice statusline at the bottom of each Vim window. It consists of several sections displaying information such as mode, environment status, filename, filetype, file encoding, and current position in the file. The plugin is highly customizable and integrates with various plugins, providing a tiny core with extensibility in mind. It is optimized for speed, supports multiple themes, and integrates seamlessly with other plugins. Vim-airline is written in 100% Vimscript, eliminating the need for Python. The plugin aims to be stable and includes a unit testing suite for reliability.

OpenManus
OpenManus is an open-source project aiming to replicate the capabilities of the Manus AI agent, known for autonomously executing complex tasks like travel planning and stock analysis. The project provides a modular, containerized framework using Docker, Python, and JavaScript, allowing developers to build, deploy, and experiment with a multi-agent AI system. Features include collaborative AI agents, Dockerized environment, task execution support, tool integration, modular design, and community-driven development. Users can interact with OpenManus via CLI, API, or web UI, and the project welcomes contributions to enhance its capabilities.

evalverse
Evalverse is an open-source project designed to support Large Language Model (LLM) evaluation needs. It provides a standardized and user-friendly solution for processing and managing LLM evaluations, catering to AI research engineers and scientists. Evalverse supports various evaluation methods, insightful reports, and no-code evaluation processes. Users can access unified evaluation with submodules, request evaluations without code via Slack bot, and obtain comprehensive reports with scores, rankings, and visuals. The tool allows for easy comparison of scores across different models and swift addition of new evaluation tools.

java-sdk
The MCP Java SDK is a set of projects that provide Java SDK integration for the Model Context Protocol. It enables Java applications to interact with AI models and tools through a standardized interface, supporting both synchronous and asynchronous communication patterns.

SciPIP
SciPIP is a scientific paper idea generation tool powered by a large language model (LLM) designed to assist researchers in quickly generating novel research ideas. It conducts a literature review based on user-provided background information and generates fresh ideas for potential studies. The tool is designed to help researchers in various fields by providing a GUI environment for idea generation, supporting NLP, multimodal, and CV fields, and allowing users to interact with the tool through a web app or terminal. SciPIP uses Neo4j as its database and provides functionalities for generating new ideas, fetching papers, and constructing the database.

spatz
Spatz is a complete, fullstack template for Svelte that includes features such as Sveltekit for building fast web apps, Pocketbase for User Auth and Database, OpenAI for chatbots, Vercel AI SDK for AI/ML models, TailwindCSS for UI development, DaisyUI for components, and Zod for schema declaration and validation. The template provides a structured project setup with components, stores, routes, and APIs. It also offers theming and styling options with pre-loaded themes from DaisyUI. Contributions are welcomed through feature requests or pull requests.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.