wandb
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
Stars: 9373
Weights & Biases (W&B) is a platform that helps users build better machine learning models faster by tracking and visualizing all components of the machine learning pipeline, from datasets to production models. It offers tools for tracking, debugging, evaluating, and monitoring machine learning applications. W&B provides integrations with popular frameworks like PyTorch, TensorFlow/Keras, Hugging Face Transformers, PyTorch Lightning, XGBoost, and Sci-Kit Learn. Users can easily log metrics, visualize performance, and compare experiments using W&B. The platform also supports hosting options in the cloud or on private infrastructure, making it versatile for various deployment needs.
README:
Use W&B to build better models faster. Track and visualize all the pieces of your machine learning pipeline, from datasets to production machine learning models. Get started with W&B today, sign up for a W&B account!
Building an LLM app? Track, debug, evaluate, and monitor LLM apps with Weave, our new suite of tools for GenAI.
See the W&B Developer Guide and API Reference Guide for a full technical description of the W&B platform.
Get started with W&B in four steps:
-
First, sign up for a W&B account.
-
Second, install the W&B SDK with pip. Navigate to your terminal and type the following command:
pip install wandb
- Third, log into W&B:
wandb.login()
- Use the example code snippet below as a template to integrate W&B to your Python script:
import wandb
# Start a W&B Run with wandb.init
run = wandb.init(project="my_first_project")
# Save model inputs and hyperparameters in a wandb.config object
config = run.config
config.learning_rate = 0.01
# Model training code here ...
# Log metrics over time to visualize performance with wandb.log
for i in range(10):
run.log({"loss": ...})
# Mark the run as finished, and finish uploading all data
run.finish()
That's it! Navigate to the W&B App to view a dashboard of your first W&B Experiment. Use the W&B App to compare multiple experiments in a unified place, dive into the results of a single run, and much more!
Example W&B Dashboard that shows Runs from an Experiment.
Use your favorite framework with W&B. W&B integrations make it fast and easy to set up experiment tracking and data versioning inside existing projects. For more information on how to integrate W&B with the framework of your choice, see the Integrations chapter in the W&B Developer Guide.
🔥 PyTorch
Call .watch
and pass in your PyTorch model to automatically log gradients and store the network topology. Next, use .log
to track other metrics. The following example demonstrates an example of how to do this:
import wandb
# 1. Start a new run
run = wandb.init(project="gpt4")
# 2. Save model inputs and hyperparameters
config = run.config
config.dropout = 0.01
# 3. Log gradients and model parameters
run.watch(model)
for batch_idx, (data, target) in enumerate(train_loader):
...
if batch_idx % args.log_interval == 0:
# 4. Log metrics to visualize performance
run.log({"loss": loss})
- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate PyTorch with W&B.
- Explore W&B Reports.
🌊 TensorFlow/Keras
Use W&B Callbacks to automatically save metrics to W&B when you call `model.fit` during training.The following code example demonstrates how your script might look like when you integrate W&B with Keras:
# This script needs these libraries to be installed:
# tensorflow, numpy
import wandb
from wandb.keras import WandbMetricsLogger, WandbModelCheckpoint
import random
import numpy as np
import tensorflow as tf
# Start a run, tracking hyperparameters
run = wandb.init(
# set the wandb project where this run will be logged
project="my-awesome-project",
# track hyperparameters and run metadata with wandb.config
config={
"layer_1": 512,
"activation_1": "relu",
"dropout": random.uniform(0.01, 0.80),
"layer_2": 10,
"activation_2": "softmax",
"optimizer": "sgd",
"loss": "sparse_categorical_crossentropy",
"metric": "accuracy",
"epoch": 8,
"batch_size": 256,
},
)
# [optional] use wandb.config as your config
config = run.config
# get the data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train, y_train = x_train[::5], y_train[::5]
x_test, y_test = x_test[::20], y_test[::20]
labels = [str(digit) for digit in range(np.max(y_train) + 1)]
# build a model
model = tf.keras.models.Sequential(
[
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(config.layer_1, activation=config.activation_1),
tf.keras.layers.Dropout(config.dropout),
tf.keras.layers.Dense(config.layer_2, activation=config.activation_2),
]
)
# compile the model
model.compile(optimizer=config.optimizer, loss=config.loss, metrics=[config.metric])
# WandbMetricsLogger will log train and validation metrics to wandb
# WandbModelCheckpoint will upload model checkpoints to wandb
history = model.fit(
x=x_train,
y=y_train,
epochs=config.epoch,
batch_size=config.batch_size,
validation_data=(x_test, y_test),
callbacks=[
WandbMetricsLogger(log_freq=5),
WandbModelCheckpoint("models"),
],
)
# [optional] finish the wandb run, necessary in notebooks
run.finish()
Get started integrating your Keras model with W&B today:
- Run an example Google Colab Notebook
- Read the Developer Guide for technical details on how to integrate Keras with W&B.
- Explore W&B Reports.
🤗 Hugging Face Transformers
Pass wandb
to the report_to
argument when you run a script using a Hugging Face Trainer. W&B will automatically log losses,
evaluation metrics, model topology, and gradients.
Note: The environment you run your script in must have wandb
installed.
The following example demonstrates how to integrate W&B with Hugging Face:
# This script needs these libraries to be installed:
# numpy, transformers, datasets
import wandb
import os
import numpy as np
from datasets import load_dataset
from transformers import TrainingArguments, Trainer
from transformers import AutoTokenizer, AutoModelForSequenceClassification
def tokenize_function(examples):
return tokenizer(examples["text"], padding="max_length", truncation=True)
def compute_metrics(eval_pred):
logits, labels = eval_pred
predictions = np.argmax(logits, axis=-1)
return {"accuracy": np.mean(predictions == labels)}
# download prepare the data
dataset = load_dataset("yelp_review_full")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
small_train_dataset = dataset["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = dataset["test"].shuffle(seed=42).select(range(300))
small_train_dataset = small_train_dataset.map(tokenize_function, batched=True)
small_eval_dataset = small_train_dataset.map(tokenize_function, batched=True)
# download the model
model = AutoModelForSequenceClassification.from_pretrained(
"distilbert-base-uncased", num_labels=5
)
# set the wandb project where this run will be logged
os.environ["WANDB_PROJECT"] = "my-awesome-project"
# save your trained model checkpoint to wandb
os.environ["WANDB_LOG_MODEL"] = "true"
# turn off watch to log faster
os.environ["WANDB_WATCH"] = "false"
# pass "wandb" to the `report_to` parameter to turn on wandb logging
training_args = TrainingArguments(
output_dir="models",
report_to="wandb",
logging_steps=5,
per_device_train_batch_size=32,
per_device_eval_batch_size=32,
evaluation_strategy="steps",
eval_steps=20,
max_steps=100,
save_steps=100,
)
# define the trainer and start training
trainer = Trainer(
model=model,
args=training_args,
train_dataset=small_train_dataset,
eval_dataset=small_eval_dataset,
compute_metrics=compute_metrics,
)
trainer.train()
# [optional] finish the wandb run, necessary in notebooks
wandb.finish()
- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate Hugging Face with W&B.
⚡️ PyTorch Lightning
Build scalable, structured, high-performance PyTorch models with Lightning and log them with W&B.
# This script needs these libraries to be installed:
# torch, torchvision, pytorch_lightning
import wandb
import os
from torch import optim, nn, utils
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger
class LitAutoEncoder(pl.LightningModule):
def __init__(self, lr=1e-3, inp_size=28, optimizer="Adam"):
super().__init__()
self.encoder = nn.Sequential(
nn.Linear(inp_size * inp_size, 64), nn.ReLU(), nn.Linear(64, 3)
)
self.decoder = nn.Sequential(
nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, inp_size * inp_size)
)
self.lr = lr
# save hyperparameters to self.hparamsm auto-logged by wandb
self.save_hyperparameters()
def training_step(self, batch, batch_idx):
x, y = batch
x = x.view(x.size(0), -1)
z = self.encoder(x)
x_hat = self.decoder(z)
loss = nn.functional.mse_loss(x_hat, x)
# log metrics to wandb
self.log("train_loss", loss)
return loss
def configure_optimizers(self):
optimizer = optim.Adam(self.parameters(), lr=self.lr)
return optimizer
# init the autoencoder
autoencoder = LitAutoEncoder(lr=1e-3, inp_size=28)
# setup data
batch_size = 32
dataset = MNIST(os.getcwd(), download=True, transform=ToTensor())
train_loader = utils.data.DataLoader(dataset, shuffle=True)
# initialise the wandb logger and name your wandb project
wandb_logger = WandbLogger(project="my-awesome-project")
# add your batch size to the wandb config
wandb_logger.experiment.config["batch_size"] = batch_size
# pass wandb_logger to the Trainer
trainer = pl.Trainer(limit_train_batches=750, max_epochs=5, logger=wandb_logger)
# train the model
trainer.fit(model=autoencoder, train_dataloaders=train_loader)
# [optional] finish the wandb run, necessary in notebooks
wandb.finish()
- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate PyTorch Lightning with W&B.
💨 XGBoost
Use W&B Callbacks to automatically save metrics to W&B when you call `model.fit` during training.The following code example demonstrates how your script might look like when you integrate W&B with XGBoost:
# This script needs these libraries to be installed:
# numpy, xgboost
import wandb
from wandb.xgboost import WandbCallback
import numpy as np
import xgboost as xgb
# setup parameters for xgboost
param = {
"objective": "multi:softmax",
"eta": 0.1,
"max_depth": 6,
"nthread": 4,
"num_class": 6,
}
# start a new wandb run to track this script
run = wandb.init(
# set the wandb project where this run will be logged
project="my-awesome-project",
# track hyperparameters and run metadata
config=param,
)
# download data from wandb Artifacts and prep data
run.use_artifact("wandb/intro/dermatology_data:v0", type="dataset").download(".")
data = np.loadtxt(
"./dermatology.data",
delimiter=",",
converters={33: lambda x: int(x == "?"), 34: lambda x: int(x) - 1},
)
sz = data.shape
train = data[: int(sz[0] * 0.7), :]
test = data[int(sz[0] * 0.7) :, :]
train_X = train[:, :33]
train_Y = train[:, 34]
test_X = test[:, :33]
test_Y = test[:, 34]
xg_train = xgb.DMatrix(train_X, label=train_Y)
xg_test = xgb.DMatrix(test_X, label=test_Y)
watchlist = [(xg_train, "train"), (xg_test, "test")]
# add another config to the wandb run
num_round = 5
run.config["num_round"] = 5
run.config["data_shape"] = sz
# pass WandbCallback to the booster to log its configs and metrics
bst = xgb.train(
param, xg_train, num_round, evals=watchlist, callbacks=[WandbCallback()]
)
# get prediction
pred = bst.predict(xg_test)
error_rate = np.sum(pred != test_Y) / test_Y.shape[0]
# log your test metric to wandb
run.summary["Error Rate"] = error_rate
# [optional] finish the wandb run, necessary in notebooks
run.finish()
- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate XGBoost with W&B.
🧮 Sci-Kit Learn
Use wandb to visualize and compare your scikit-learn models' performance:# This script needs these libraries to be installed:
# numpy, sklearn
import wandb
from wandb.sklearn import plot_precision_recall, plot_feature_importances
from wandb.sklearn import plot_class_proportions, plot_learning_curve, plot_roc
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
# load and process data
wbcd = datasets.load_breast_cancer()
feature_names = wbcd.feature_names
labels = wbcd.target_names
test_size = 0.2
X_train, X_test, y_train, y_test = train_test_split(
wbcd.data, wbcd.target, test_size=test_size
)
# train model
model = RandomForestClassifier()
model.fit(X_train, y_train)
model_params = model.get_params()
# get predictions
y_pred = model.predict(X_test)
y_probas = model.predict_proba(X_test)
importances = model.feature_importances_
indices = np.argsort(importances)[::-1]
# start a new wandb run and add your model hyperparameters
run = wandb.init(project="my-awesome-project", config=model_params)
# Add additional configs to wandb
run.config.update(
{
"test_size": test_size,
"train_len": len(X_train),
"test_len": len(X_test),
}
)
# log additional visualisations to wandb
plot_class_proportions(y_train, y_test, labels)
plot_learning_curve(model, X_train, y_train)
plot_roc(y_test, y_probas, labels)
plot_precision_recall(y_test, y_probas, labels)
plot_feature_importances(model)
# [optional] finish the wandb run, necessary in notebooks
run.finish()
- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate Scikit-Learn with W&B.
Weights & Biases is available in the cloud or installed on your private infrastructure. Set up a W&B Server in a production environment in one of three ways:
- Production Cloud: Set up a production deployment on a private cloud in just a few steps using terraform scripts provided by W&B.
- Dedicated Cloud: A managed, dedicated deployment on W&B's single-tenant infrastructure in your choice of cloud region.
-
On-Prem/Bare Metal: W&B supports setting up a production server on most bare metal servers in your on-premise data centers. Quickly get started by running
wandb server
to easily start hosting W&B on your local infrastructure.
See the Hosting documentation in the W&B Developer Guide for more information.
We are committed to supporting our minimum required Python version for at least six months after its official end-of-life (EOL) date, as defined by the Python Software Foundation. You can find a list of Python EOL dates here.
When we discontinue support for a Python version, we will increment the library’s minor version number to reflect this change.
Weights & Biases ❤️ open source, and we welcome contributions from the community! See the Contribution guide for more information on the development workflow and the internals of the wandb library. For wandb bugs and feature requests, visit GitHub Issues or contact [email protected].
Be a part of the growing W&B Community and interact with the W&B team in our Discord. Stay connected with the latest ML updates and tutorials with W&B Fully Connected.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for wandb
Similar Open Source Tools
wandb
Weights & Biases (W&B) is a platform that helps users build better machine learning models faster by tracking and visualizing all components of the machine learning pipeline, from datasets to production models. It offers tools for tracking, debugging, evaluating, and monitoring machine learning applications. W&B provides integrations with popular frameworks like PyTorch, TensorFlow/Keras, Hugging Face Transformers, PyTorch Lightning, XGBoost, and Sci-Kit Learn. Users can easily log metrics, visualize performance, and compare experiments using W&B. The platform also supports hosting options in the cloud or on private infrastructure, making it versatile for various deployment needs.
continuous-eval
Open-Source Evaluation for LLM Applications. `continuous-eval` is an open-source package created for granular and holistic evaluation of GenAI application pipelines. It offers modularized evaluation, a comprehensive metric library covering various LLM use cases, the ability to leverage user feedback in evaluation, and synthetic dataset generation for testing pipelines. Users can define their own metrics by extending the Metric class. The tool allows running evaluation on a pipeline defined with modules and corresponding metrics. Additionally, it provides synthetic data generation capabilities to create user interaction data for evaluation or training purposes.
zeta
Zeta is a tool designed to build state-of-the-art AI models faster by providing modular, high-performance, and scalable building blocks. It addresses the common issues faced while working with neural nets, such as chaotic codebases, lack of modularity, and low performance modules. Zeta emphasizes usability, modularity, and performance, and is currently used in hundreds of models across various GitHub repositories. It enables users to prototype, train, optimize, and deploy the latest SOTA neural nets into production. The tool offers various modules like FlashAttention, SwiGLUStacked, RelativePositionBias, FeedForward, BitLinear, PalmE, Unet, VisionEmbeddings, niva, FusedDenseGELUDense, FusedDropoutLayerNorm, MambaBlock, Film, hyper_optimize, DPO, and ZetaCloud for different tasks in AI model development.
beyondllm
Beyond LLM offers an all-in-one toolkit for experimentation, evaluation, and deployment of Retrieval-Augmented Generation (RAG) systems. It simplifies the process with automated integration, customizable evaluation metrics, and support for various Large Language Models (LLMs) tailored to specific needs. The aim is to reduce LLM hallucination risks and enhance reliability.
Ollama
Ollama SDK for .NET is a fully generated C# SDK based on OpenAPI specification using OpenApiGenerator. It supports automatic releases of new preview versions, source generator for defining tools natively through C# interfaces, and all modern .NET features. The SDK provides support for all Ollama API endpoints including chats, embeddings, listing models, pulling and creating new models, and more. It also offers tools for interacting with weather data and providing weather-related information to users.
rl
TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. It provides pytorch and **python-first** , low and high level abstractions for RL that are intended to be **efficient** , **modular** , **documented** and properly **tested**. The code is aimed at supporting research in RL. Most of it is written in python in a highly modular way, such that researchers can easily swap components, transform them or write new ones with little effort.
RWKV-LM
RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly compute the hidden state for the "RNN" mode. So it's combining the best of RNN and transformer - **great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding** (using the final hidden state).
dynamiq
Dynamiq is an orchestration framework designed to streamline the development of AI-powered applications, specializing in orchestrating retrieval-augmented generation (RAG) and large language model (LLM) agents. It provides an all-in-one Gen AI framework for agentic AI and LLM applications, offering tools for multi-agent orchestration, document indexing, and retrieval flows. With Dynamiq, users can easily build and deploy AI solutions for various tasks.
lancedb
LanceDB is an open-source database for vector-search built with persistent storage, which greatly simplifies retrieval, filtering, and management of embeddings. The key features of LanceDB include: Production-scale vector search with no servers to manage. Store, query, and filter vectors, metadata, and multi-modal data (text, images, videos, point clouds, and more). Support for vector similarity search, full-text search, and SQL. Native Python and Javascript/Typescript support. Zero-copy, automatic versioning, manage versions of your data without needing extra infrastructure. GPU support in building vector index(*). Ecosystem integrations with LangChain 🦜️🔗, LlamaIndex 🦙, Apache-Arrow, Pandas, Polars, DuckDB, and more on the way. LanceDB's core is written in Rust 🦀 and is built using Lance, an open-source columnar format designed for performant ML workloads.
pywhy-llm
PyWhy-LLM is an innovative library that integrates Large Language Models (LLMs) into the causal analysis process, empowering users with knowledge previously only available through domain experts. It seamlessly augments existing causal inference processes by suggesting potential confounders, relationships between variables, backdoor sets, front door sets, IV sets, estimands, critiques of DAGs, latent confounders, and negative controls. By leveraging LLMs and formalizing human-LLM collaboration, PyWhy-LLM aims to enhance causal analysis accessibility and insight.
create-million-parameter-llm-from-scratch
The 'create-million-parameter-llm-from-scratch' repository provides a detailed guide on creating a Large Language Model (LLM) with 2.3 million parameters from scratch. The blog replicates the LLaMA approach, incorporating concepts like RMSNorm for pre-normalization, SwiGLU activation function, and Rotary Embeddings. The model is trained on a basic dataset to demonstrate the ease of creating a million-parameter LLM without the need for a high-end GPU.
langchaingo
LangChain Go is a Go language implementation of LangChain, a framework for building applications with LLMs through composability. It provides a simple and easy-to-use API for interacting with LLMs, making it easy to add language-based features to your applications.
composio
Composio is a production-ready toolset for AI agents that enables users to integrate AI agents with various agentic tools effortlessly. It provides support for over 100 tools across different categories, including popular softwares like GitHub, Notion, Linear, Gmail, Slack, and more. Composio ensures managed authorization with support for six different authentication protocols, offering better agentic accuracy and ease of use. Users can easily extend Composio with additional tools, frameworks, and authorization protocols. The toolset is designed to be embeddable and pluggable, allowing for seamless integration and consistent user experience.
vocode-core
Vocode is an open source library that enables users to build voice-based LLM (Large Language Model) applications quickly and easily. With Vocode, users can create real-time streaming conversations with LLMs and deploy them for phone calls, Zoom meetings, and more. The library offers abstractions and integrations for transcription services, LLMs, and synthesis services, making it a comprehensive tool for voice-based app development. Vocode also provides out-of-the-box integrations with various services like AssemblyAI, OpenAI, Microsoft Azure, and more, allowing users to leverage these services seamlessly in their applications.
gritlm
The 'gritlm' repository provides all materials for the paper Generative Representational Instruction Tuning. It includes code for inference, training, evaluation, and known issues related to the GritLM model. The repository also offers models for embedding and generation tasks, along with instructions on how to train and evaluate the models. Additionally, it contains visualizations, acknowledgements, and a citation for referencing the work.
mlflow
MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud). MLflow's current components are:
* `MLflow Tracking
For similar tasks
wandb
Weights & Biases (W&B) is a platform that helps users build better machine learning models faster by tracking and visualizing all components of the machine learning pipeline, from datasets to production models. It offers tools for tracking, debugging, evaluating, and monitoring machine learning applications. W&B provides integrations with popular frameworks like PyTorch, TensorFlow/Keras, Hugging Face Transformers, PyTorch Lightning, XGBoost, and Sci-Kit Learn. Users can easily log metrics, visualize performance, and compare experiments using W&B. The platform also supports hosting options in the cloud or on private infrastructure, making it versatile for various deployment needs.
ml-engineering
This repository provides a comprehensive collection of methodologies, tools, and step-by-step instructions for successful training of large language models (LLMs) and multi-modal models. It is a technical resource suitable for LLM/VLM training engineers and operators, containing numerous scripts and copy-n-paste commands to facilitate quick problem-solving. The repository is an ongoing compilation of the author's experiences training BLOOM-176B and IDEFICS-80B models, and currently focuses on the development and training of Retrieval Augmented Generation (RAG) models at Contextual.AI. The content is organized into six parts: Insights, Hardware, Orchestration, Training, Development, and Miscellaneous. It includes key comparison tables for high-end accelerators and networks, as well as shortcuts to frequently needed tools and guides. The repository is open to contributions and discussions, and is licensed under Attribution-ShareAlike 4.0 International.
Wandb.jl
Unofficial Julia Bindings for wandb.ai. Wandb is a platform for tracking and visualizing machine learning experiments. It provides a simple and consistent way to log metrics, parameters, and other data from your experiments, and to visualize them in a variety of ways. Wandb.jl provides a convenient way to use Wandb from Julia.
qlora-pipe
qlora-pipe is a pipeline parallel training script designed for efficiently training large language models that cannot fit on one GPU. It supports QLoRA, LoRA, and full fine-tuning, with efficient model loading and the ability to load any dataset that Axolotl can handle. The script allows for raw text training, resuming training from a checkpoint, logging metrics to Tensorboard, specifying a separate evaluation dataset, training on multiple datasets simultaneously, and supports various models like Llama, Mistral, Mixtral, Qwen-1.5, and Cohere (Command R). It handles pipeline- and data-parallelism using Deepspeed, enabling users to set the number of GPUs, pipeline stages, and gradient accumulation steps for optimal utilization.
metaflow
Metaflow is a user-friendly library designed to assist scientists and engineers in developing and managing real-world data science projects. Initially created at Netflix, Metaflow aimed to enhance the productivity of data scientists working on diverse projects ranging from traditional statistics to cutting-edge deep learning. For further information, refer to Metaflow's website and documentation.
mlflow
MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud). MLflow's current components are:
* `MLflow Tracking
fasttrackml
FastTrackML is an experiment tracking server focused on speed and scalability, fully compatible with MLFlow. It provides a user-friendly interface to track and visualize your machine learning experiments, making it easy to compare different models and identify the best performing ones. FastTrackML is open source and can be easily installed and run with pip or Docker. It is also compatible with the MLFlow Python package, making it easy to integrate with your existing MLFlow workflows.
zenml
ZenML is an extensible, open-source MLOps framework for creating portable, production-ready machine learning pipelines. By decoupling infrastructure from code, ZenML enables developers across your organization to collaborate more effectively as they develop to production.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.