aim

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

Stars: 5400



README:


An easy-to-use & supercharged open-source experiment tracker

Aim logs your training runs and any AI metadata, provides a beautiful UI to compare and observe them, and offers an API to query them programmatically.


AimStack offers enterprise support beyond core Aim. Contact via e-mail: [email protected]


ℹ️ About

Aim is an open-source, self-hosted ML experiment tracking tool designed to handle 10,000s of training runs.

Aim provides a performant and beautiful UI for exploring and comparing training runs. Additionally, its SDK enables programmatic access to tracked metadata — perfect for automations and Jupyter Notebook analysis.

Aim's mission is to democratize AI dev tools 🎯

Log Metadata Across Your ML Pipeline 💾
- ML experiments and any metadata tracking
- Integration with popular ML frameworks
- Easy migration from other experiment trackers

Visualize & Compare Metadata via UI 📊
- Metadata visualization via Aim Explorers
- Grouping and aggregation
- Querying using Python expressions

Run ML Trainings Effectively ⚡
- System info and resource usage tracking
- Real-time alerting on training progress
- Logging and configurable notifications

Organize Your Experiments 🗂️
- Detailed run information for easy debugging
- Centralized dashboard for holistic view
- Runs grouping with tags and experiments

🎬 Demos

Check out live Aim demos NOW to see it in action.

- Machine translation experiments: training logs of a neural translation model (from the WMT'19 competition).
- lightweight-GAN experiments: training logs of the 'lightweight' GAN, proposed at ICLR 2021.
- FastSpeech 2 experiments: training logs of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech".
- Simple MNIST: simple MNIST training logs.

🌍 Ecosystem

Aim is not just an experiment tracker. It is the groundwork for an ecosystem. Check out the two best-known Aim-based tools.

- aimlflow: explore MLflow experiments with a powerful UI
- Aim-spaCy: an Aim-based spaCy experiment tracker

🏁 Quick start

Follow the steps below to get started with Aim.

1. Install Aim on your training environment

pip3 install aim

2. Integrate Aim with your code

from aim import Run

# Initialize a new run
run = Run()

# Log run parameters
run["hparams"] = {
    "learning_rate": 0.001,
    "batch_size": 32,
}

# Log metrics
for i in range(10):
    run.track(i, name='loss', step=i, context={ "subset":"train" })
    run.track(i, name='acc', step=i, context={ "subset":"train" })

See the full list of supported trackable objects (e.g. images, text, etc.) here.

3. Run the training as usual and start Aim UI

aim up

Learn more

Migrate from other tools

Aim has built-in converters to easily migrate logs from other tools. These migrations cover the most common usage scenarios. In case of custom and complex scenarios you can use Aim SDK to implement your own conversion script.
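
For a custom conversion, one possible shape is a small script that reads the other tool's export and replays each value through a `track` call. The sketch below is an illustration under stated assumptions, not one of Aim's built-in converters: the CSV layout (a `step` column plus one column per metric), the `replay_csv_log` helper, and the sample data are all hypothetical. It accepts any callable shaped like the `run.track(value, name=..., step=..., context=...)` call from the quick start, so it can be tried without an Aim repo.

```python
import csv
import io
from typing import Callable, Dict, Iterable


def replay_csv_log(
    rows: Iterable[Dict[str, str]],
    track: Callable[..., None],
    subset: str = "train",
) -> int:
    """Replay CSV log rows through a track(value, name=, step=, context=) callable.

    Hypothetical helper: assumes a `step` column and one column per metric.
    Returns the number of values that were tracked.
    """
    count = 0
    for row in rows:
        step = int(row["step"])
        for name, raw in row.items():
            if name == "step" or raw == "":
                continue  # skip the step column and missing values
            track(float(raw), name=name, step=step, context={"subset": subset})
            count += 1
    return count


# Hypothetical log exported by another tool; the last row has no `acc` value.
SAMPLE_LOG = """step,loss,acc
0,0.90,0.41
1,0.55,0.63
2,0.31,
"""

# Stand-in for `run.track` that only records what it receives.
recorded = []


def record(value, name, step, context):
    recorded.append((name, step, value, context["subset"]))


logged = replay_csv_log(csv.DictReader(io.StringIO(SAMPLE_LOG)), record)
```

With Aim installed, passing the `track` method of a `Run` created as in the quick start, in place of `record`, should log the same values to an Aim repo. Keeping the tracker as a parameter lets the parsing logic be checked on its own before any data is written.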

Integrate Aim into an existing project

Aim easily integrates with a wide range of ML frameworks, providing built-in callbacks for most of them.

Query runs programmatically via SDK

Aim Python SDK empowers you to query and access any piece of tracked metadata with ease.

from aim import Repo

my_repo = Repo('/path/to/aim/repo')

query = "metric.name == 'loss'" # Example query

# Get collection of metrics
for run_metrics_collection in my_repo.query_metrics(query).iter_runs():
    for metric in run_metrics_collection:
        # Get run params
        params = metric.run[...]
        # Get metric values
        steps, metric_values = metric.values.sparse_numpy()
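
To build intuition for what a Python-expression query such as `metric.name == 'loss'` selects, the following toy model filters in-memory records with the same kind of expression. It is a sketch only and does not reflect how Aim evaluates queries internally: `select_metrics`, `make_record`, the record layout, and the sample values are all hypothetical, and the context is modeled here as a plain dict.

```python
from types import SimpleNamespace


def select_metrics(records, expression):
    """Return the records for which the Python expression is truthy.

    Each record is exposed to the expression as `metric`, and its run as `run`.
    Illustration only: eval() must never be used on untrusted input.
    """
    code = compile(expression, "<query>", "eval")
    selected = []
    for record in records:
        scope = {"metric": record, "run": record.run}
        if eval(code, {"__builtins__": {}}, scope):
            selected.append(record)
    return selected


def make_record(name, subset, learning_rate, values):
    """Build a hypothetical metric record attached to a hypothetical run."""
    run = SimpleNamespace(hparams=SimpleNamespace(learning_rate=learning_rate))
    return SimpleNamespace(
        name=name, context={"subset": subset}, run=run, values=values
    )


RECORDS = [
    make_record("loss", "train", 0.001, [0.9, 0.5, 0.3]),
    make_record("loss", "val", 0.001, [1.0, 0.7, 0.6]),
    make_record("acc", "train", 0.01, [0.4, 0.6, 0.7]),
]

# Combine a metric name filter with a context filter.
val_loss = select_metrics(
    RECORDS, "metric.name == 'loss' and metric.context['subset'] == 'val'"
)

# Filter by a run parameter instead of a metric property.
low_lr = select_metrics(RECORDS, "run.hparams.learning_rate < 0.005")
```

The point of the sketch is that the query is ordinary Python evaluated per metric, so name, context, and run parameters can be combined in a single condition.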

Set up a centralized tracking server

The Aim remote tracking server allows running experiments in a multi-host environment and collecting tracked data in a centralized location.

See the docs on how to set up the remote server.

Deploy Aim on Kubernetes

The official Aim Docker image: https://hub.docker.com/r/aimstack/aim
A guide on how to deploy Aim on Kubernetes: https://aimstack.readthedocs.io/en/latest/using/k8s_deployment.html

Read the full documentation on aimstack.readthedocs.io 📖

🆚 Comparisons to familiar tools

TensorBoard vs Aim

Training run comparison

Order of magnitude faster training run comparison with Aim

Tracked params are first-class citizens in Aim. You can search, group, and aggregate by params, and deeply explore all the tracked data (metrics, params, images) in the UI.
With TensorBoard, users are forced to record those parameters in the training run name to be able to search and compare. This makes comparison tedious and causes usability issues in the UI when there are many experiments and params. TensorBoard has no features to group or aggregate metrics.
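
To make the grouping and aggregation idea concrete, here is a minimal sketch in plain Python of what "group by a param, then aggregate a metric" means when params are stored as structured data rather than encoded in a run name. It is a conceptual illustration, not Aim's implementation: the `RUNS` data, `group_by_param`, and `mean_curve` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical runs: params are structured data, not part of a run name.
RUNS = [
    {"hparams": {"learning_rate": 0.001, "batch_size": 32}, "loss": [0.9, 0.5, 0.3]},
    {"hparams": {"learning_rate": 0.001, "batch_size": 64}, "loss": [0.8, 0.6, 0.5]},
    {"hparams": {"learning_rate": 0.01, "batch_size": 32}, "loss": [0.7, 0.4, 0.2]},
]


def group_by_param(runs, param):
    """Group runs by the value of one hyperparameter."""
    groups = defaultdict(list)
    for run in runs:
        groups[run["hparams"][param]].append(run)
    return dict(groups)


def mean_curve(runs, metric):
    """Average a metric step by step across runs (assumes equal-length series)."""
    return [mean(values) for values in zip(*(run[metric] for run in runs))]


# One averaged loss curve per learning rate.
aggregated = {
    value: mean_curve(group, "loss")
    for value, group in group_by_param(RUNS, "learning_rate").items()
}
```

Switching the grouping key from `learning_rate` to `batch_size` needs no renaming or re-logging of runs, which is the practical difference from encoding params in run names.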

Scalability

Aim is built to handle 1000s of training runs - both on the backend and on the UI.
TensorBoard becomes really slow and hard to use when a few hundred training runs are queried / compared.

Beloved TensorBoard visualizations to be added to Aim

Embedding projector.
Neural network visualization.

MLflow vs Aim

MLflow is an end-to-end ML lifecycle tool. Aim is focused on training tracking. The main differences between Aim and MLflow are around UI scalability and run comparison features.

Aim and MLflow are a perfect match: check out aimlflow, the tool that enables Aim superpowers on MLflow.

Run comparison

Aim treats tracked parameters as first-class citizens. Users can query runs, metrics, images and filter using the params.
MLflow does have search by tracked config, but grouping, aggregation, subplotting by hyperparams, and other comparison features are not available.

UI Scalability

Aim UI can smoothly handle several thousand metrics at the same time with 1000s of steps each. It may get shaky when you explore 1000s of metrics with 10,000s of steps each. But we are constantly optimizing!
MLflow UI becomes slow to use when there are a few hundred runs.

Weights and Biases vs Aim

Hosted vs self-hosted

Weights and Biases is a hosted, closed-source MLOps platform.
Aim is a self-hosted, free, and open-source experiment tracking tool.

🛣️ Roadmap

Detailed milestones

The Aim product roadmap ❇️

The Backlog contains the issues we choose from and prioritize weekly.
Issues are prioritized mainly by how highly requested the features are.

High-level roadmap

The high-level features we are going to work on over the next few months:

In progress

[ ] Aim SDK low-level interface
[ ] Dashboards – customizable layouts with embedded explorers
[ ] Ergonomic UI kit
[ ] Text Explorer

Next-up

Aim UI

- Runs management
  - Runs explorer – query and visualize runs data (images, audio, distributions, ...) in a central dashboard
- Explorers
  - Distributions Explorer

SDK and Storage

- Scalability
  - Smooth UI and SDK experience with over 10,000 runs
- Runs management
  - CLI commands
    - Reporting – runs summary and run details in a CLI-compatible format
    - Manipulations – copy, move, delete runs, params and sequences
- Cloud storage support – store runs blob (e.g. images) data on the cloud
- Artifact storage – store files, model checkpoints, and beyond

Integrations

- ML frameworks
  - Shortlist: scikit-learn
- Resource management tools
  - Shortlist: Kubeflow, Slurm
- Workflow orchestration tools

Done

[x] Live updates (Shipped: Oct 18 2021)
[x] Images tracking and visualization (Start: Oct 18 2021, Shipped: Nov 19 2021)
[x] Distributions tracking and visualization (Start: Nov 10 2021, Shipped: Dec 3 2021)
[x] Jupyter integration (Start: Nov 18 2021, Shipped: Dec 3 2021)
[x] Audio tracking and visualization (Start: Dec 6 2021, Shipped: Dec 17 2021)
[x] Transcripts tracking and visualization (Start: Dec 6 2021, Shipped: Dec 17 2021)
[x] Plotly integration (Start: Dec 1 2021, Shipped: Dec 17 2021)
[x] Colab integration (Start: Nov 18 2021, Shipped: Dec 17 2021)
[x] Centralized tracking server (Start: Oct 18 2021, Shipped: Jan 22 2022)
[x] TensorBoard adaptor - visualize TensorBoard logs with Aim (Start: Dec 17 2021, Shipped: Feb 3 2022)
[x] Track git info, env vars, CLI arguments, dependencies (Start: Jan 17 2022, Shipped: Feb 3 2022)
[x] MLflow adaptor - visualize MLflow logs with Aim (Start: Feb 14 2022, Shipped: Feb 22 2022)
[x] Activeloop Hub integration (Start: Feb 14 2022, Shipped: Feb 22 2022)
[x] PyTorch-Ignite integration (Start: Feb 14 2022, Shipped: Feb 22 2022)
[x] Run summary and overview info (system params, CLI args, git info, ...) (Start: Feb 14 2022, Shipped: Mar 9 2022)
[x] Add DVC related metadata into aim run (Start: Mar 7 2022, Shipped: Mar 26 2022)
[x] Ability to attach notes to Run from UI (Start: Mar 7 2022, Shipped: Apr 29 2022)
[x] Fairseq integration (Start: Mar 27 2022, Shipped: Mar 29 2022)
[x] LightGBM integration (Start: Apr 14 2022, Shipped: May 17 2022)
[x] CatBoost integration (Start: Apr 20 2022, Shipped: May 17 2022)
[x] Run execution details (display stdout/stderr logs) (Start: Apr 25 2022, Shipped: May 17 2022)
[x] Long sequences (up to 5M steps) support (Start: Apr 25 2022, Shipped: Jun 22 2022)
[x] Figures Explorer (Start: Mar 1 2022, Shipped: Aug 21 2022)
[x] Notify on stuck runs (Start: Jul 22 2022, Shipped: Aug 21 2022)
[x] Integration with KerasTuner (Start: Aug 10 2022, Shipped: Aug 21 2022)
[x] Integration with WandB (Start: Aug 15 2022, Shipped: Aug 21 2022)
[x] Stable remote tracking server (Start: Jun 15 2022, Shipped: Aug 21 2022)
[x] Integration with fast.ai (Start: Aug 22 2022, Shipped: Oct 6 2022)
[x] Integration with MXNet (Start: Sep 20 2022, Shipped: Oct 6 2022)
[x] Project overview page (Start: Sep 1 2022, Shipped: Oct 6 2022)
[x] Remote tracking server scaling (Start: Sep 11 2022, Shipped: Nov 26 2022)
[x] Integration with PaddlePaddle (Start: Oct 2 2022, Shipped: Nov 26 2022)
[x] Integration with Optuna (Start: Oct 2 2022, Shipped: Nov 26 2022)
[x] Audios Explorer (Start: Oct 30 2022, Shipped: Nov 26 2022)
[x] Experiment page (Start: Nov 9 2022, Shipped: Nov 26 2022)
[x] HuggingFace datasets (Start: Dec 29 2022, Shipped: Feb 3 2023)

👥 Community

Aim README badge

Add the Aim badge to your README if you've enjoyed using Aim in your work:

[![Aim](https://img.shields.io/badge/powered%20by-Aim-%231473E6)](https://github.com/aimhubio/aim)

Cite Aim in your papers

In case you've found Aim helpful in your research journey, we'd be thrilled if you could acknowledge Aim's contribution:

@software{Arakelyan_Aim_2020,
  author = {Arakelyan, Gor and Soghomonyan, Gevorg and {The Aim team}},
  doi = {10.5281/zenodo.6536395},
  license = {Apache-2.0},
  month = {6},
  title = {{Aim}},
  url = {https://github.com/aimhubio/aim},
  version = {3.9.3},
  year = {2020}
}

Contributing to Aim

Considering contributing to Aim? To get started, please take a moment to read the CONTRIBUTING.md guide.

Join Aim contributors by submitting your first pull request. Happy coding! 😊



