
mlir-aie
An MLIR-based toolchain for AMD AI Engine-enabled devices.

README:
This repository contains an MLIR-based toolchain for AI Engine-enabled devices, such as AMD Ryzen™ AI and Versal™. This repository can be used to generate low-level configurations for the AI Engine portion of these devices. AI Engines are organized as a spatial array of tiles, where each tile contains AI Engine cores and/or memories. The spatial array is connected by stream switches that can be configured to route data between AI Engine tiles scheduled by their programmable Data Movement Accelerators (DMAs). This repository contains MLIR representations, with multiple levels of abstraction, to target AI Engine devices. This enables compilers and developers to program AI Engine cores, as well as describe data movements and array connectivity. A Python API is made available as a convenient interface for generating MLIR design descriptions. Backend code generation is also included, targeting the aie-rt library.
This project is primarily intended to support the open-source community, particularly tool builders, with low-level access to AIE devices and to enable the development of a wide variety of programming models from higher-level abstractions. We provide an example programming flow: Interface Representation for hands-ON (IRON) close-to-metal programming of the AIE-array. IRON is an open-access toolkit enabling performance engineers to build fast, efficient, and often specialized designs through a set of Python language bindings around the mlir-aie dialect. As such, it contains some examples; however, this project is not intended to represent an end-to-end compilation flow for all application designs. If you're looking for an out-of-the-box experience for highly efficient machine learning, check out the AMD Ryzen™ AI Software Platform.
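To make the flow concrete, the sketch below shows roughly what an IRON-based build looks like at the command line. It is illustrative only: the design script name aie2.py is a hypothetical stand-in for the scripts in programming_examples, the per-example Makefiles automate these steps, and the exact aiecc.py flags vary per design.

# Hypothetical sketch of the IRON flow; script name and paths are assumptions.
mkdir -p build
python aie2.py > build/aie.mlir    # the Python bindings emit an MLIR design description
aiecc.py build/aie.mlir            # the mlir-aie compiler driver lowers the design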
These instructions will guide you through everything required for building and executing a program on the Ryzen™ AI NPU, starting from a fresh, bare-bones Ubuntu 24.04.2 or Ubuntu 24.10 install with the included Linux 6.11 kernel.
Be sure you have the latest BIOS on your laptop or mini-PC so that the NPU is enabled. See here.

Turn off Secure Boot (this allows unsigned drivers to be installed):

BIOS → Security → Secure boot → Disable
Clone the XDNA™ driver repository and its submodules:

git clone https://github.com/amd/xdna-driver.git
export XDNA_SRC_DIR=$(realpath xdna-driver)
cd xdna-driver
git submodule update --init --recursive
The submodules use SSH remotes. You will need a GitHub account and locally installed SSH keys to pull the submodules. Follow these instructions to set up an SSH key. Alternatively, edit .gitmodules to use HTTPS instead of SSH.
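For example, one way to switch the submodules to HTTPS (a sketch; it assumes every submodule remote points at github.com over SSH):

# Rewrite SSH submodule URLs to HTTPS, then re-sync and fetch.
sed -i 's|git@github.com:|https://github.com/|g' .gitmodules
git submodule sync --recursive
git submodule update --init --recursive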
Install XRT. (The steps below are adapted from here.)

Install the XRT prerequisites:

cd $XDNA_SRC_DIR
sudo ./tools/amdxdna_deps.sh
Build XRT:

cd $XDNA_SRC_DIR/xrt/build
./build.sh -npu -opt
Install XRT:

cd $XDNA_SRC_DIR/xrt/build/Release
sudo apt reinstall ./xrt_202510.2.19.0_24.10-amd64-npu.deb

An error might occur during this process. If so, run the following steps:

cd $XDNA_SRC_DIR/xrt/build/Release
sudo apt remove xrt-npu
sudo dpkg -i --force-overwrite ./xrt_202510.2.19.0_24.10-amd64-npu.deb
sudo apt -f install
sudo apt reinstall ./xrt_202510.2.19.0_24.10-amd64-npu.deb
Build the XDNA™ driver. (The steps below are adapted from here.)

cd $XDNA_SRC_DIR/build
./build.sh -release
./build.sh -package
Install XDNA™:

cd $XDNA_SRC_DIR/build/Release
sudo apt reinstall ./xrt_plugin.2.19.0_ubuntu24.10-x86_64-amdxdna.deb
Check that the NPU is working by verifying that the device appears with xrt-smi:

source /opt/xilinx/xrt/setup.sh
xrt-smi examine

At the bottom of the output you should see:

Devices present
BDF             :  Name
------------------------------------
[0000:66:00.1]  :  RyzenAI-npu1
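If you want to script this check, a small sketch like the following works; the RyzenAI-npu prefix matches the output above, but the exact device name differs across NPU generations.

# Illustrative check: report whether an NPU device shows up in xrt-smi's output.
source /opt/xilinx/xrt/setup.sh
if xrt-smi examine | grep -q "RyzenAI-npu"; then
  echo "NPU detected"
else
  echo "NPU not found: check the BIOS settings and driver installation"
fi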
Install the following packages needed for MLIR-AIE:

sudo apt install \
  build-essential clang clang-14 lld lld-14 cmake \
  python3-venv python3-pip \
  libxrender1 libxtst6 libxi6
Install g++-13 and OpenCV, which are needed for some programming examples:

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update
sudo apt install gcc-13 g++-13 -y
sudo apt install libopencv-dev python3-opencv
Clone the mlir-aie repository:

git clone https://github.com/Xilinx/mlir-aie.git
cd mlir-aie
Source utils/quick_setup.sh to set up the prerequisites and install the mlir-aie compiler tools from wheels.

Remember to set up your environment, including IRON and XRT:

source /opt/xilinx/xrt/setup.sh
source ironenv/bin/activate
source utils/env_setup.sh
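As a quick sanity check that the tools landed in your environment, something like the following can help; it assumes the wheels expose an aie Python package and put aiecc.py on the PATH, which may differ between releases.

# Both commands should succeed inside ironenv after sourcing the setup scripts.
python3 -c "import aie; print('mlir-aie Python bindings OK')"
which aiecc.py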
For your design of interest, for instance from programming_examples, two steps are needed: (i) build the AIE design and then (ii) build the host code.
Go to the design of interest and run:

make
Build the host code: in the same design folder where the AIE design was just built (see above), run:

make <testName>.exe

Note that the host code target has a .exe file extension even on Linux. Although unusual, this is an easy way for us to distinguish whether we want to compile device code or host code.
Run the design (the program arguments are just an example for the vector_scalar_add design):

make run
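Putting the steps together for the vector_scalar_add example mentioned above (a sketch: the directory is from programming_examples, but check the example's Makefile for the exact host-code target name):

# Illustrative end-to-end flow for one programming example.
cd programming_examples/basic/vector_scalar_add
make                          # (i) build the AIE design
make vector_scalar_add.exe    # (ii) build the host code (note the .exe suffix)
make run                      # execute on the NPU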
Continue to the IRON AIE Application Programming Guide.
Some MLIR-AIE documentation is available on the website.
You may skip the Vitis™ installation step if you intend to target only AMD XDNA™/AIE-ML (AIE2) and AMD XDNA™ 2 (AIE2P) using our open-source single-core compiler Peano. Compiling with xchesscc is not supported without installing AMD Vitis™ AIE Essentials.
Install Vitis™ AIE Essentials from Ryzen AI Software 1.3 Early Access. We will assume you use the installation directory /tools/ryzen_ai-1.3.0/vitis_aie_essentials. This is an early access lounge; you must register and be granted access at this time.
Download the VAIML installer for Linux-based compilation: ryzen_ai-1.3.0ea1.tgz
Extract the required tools:

tar -xzvf ryzen_ai-1.3.0ea1.tgz
cd ryzen_ai-1.3.0
mkdir vitis_aie_essentials
mv vitis_aie_essentials*.whl vitis_aie_essentials
cd vitis_aie_essentials
unzip vitis_aie_essentials*.whl
Set up an AI Engine license:

Get a local license for AI Engine tools from https://www.xilinx.com/getlicense.
Copy your license file (Xilinx.lic) to your preferred location, e.g. /opt/Xilinx.lic.
Set up your environment for the Vitis™ AIE tools using the following script:

#!/bin/bash
#################################################################################
# Setup Vitis AIE Essentials
#################################################################################
export AIETOOLS_ROOT=/tools/ryzen_ai-1.3.0/vitis_aie_essentials
export PATH=$PATH:${AIETOOLS_ROOT}/bin
export LM_LICENSE_FILE=/opt/Xilinx.lic
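After sourcing that script (saved, for instance, under a hypothetical name such as setup_vitis_aie_essentials.sh), you can verify that the AIE tools and license are visible:

# Illustrative check: xchesscc should resolve and the license variable should be set.
source setup_vitis_aie_essentials.sh    # hypothetical file name for the script above
which xchesscc
echo "$LM_LICENSE_FILE"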
Be sure you have the latest BIOS for your laptop or mini-PC; this will ensure the NPU (sometimes referred to as IPU) is enabled in the system. You may need to manually enable the NPU:

Advanced → CPU Configuration → IPU

NOTE: Some manufacturers only provide Windows executables to update the BIOS; please do this before installing Ubuntu.
Getting Started on a Versal™ board
Getting Started and Running on Windows Ryzen™ AI
Getting Started and Running on Linux Ryzen™ AI
IRON AIE Application Programming Guide
MLIR Dialect and Compiler Documentation
Interested in contributing to MLIR-AIE? Information for developers
Copyright © 2019-2024 Advanced Micro Devices, Inc.