
crush
The glamourous AI coding agent for your favourite terminal 💘
Stars: 13214

Crush is a versatile tool designed to enhance coding workflows in your terminal. It offers support for multiple LLMs, allows for flexible switching between models, and enables session-based work management. Crush is extensible through MCPs and works across various operating systems. It can be installed using package managers like Homebrew and NPM, or downloaded directly. Crush supports various APIs like Anthropic, OpenAI, Groq, and Google Gemini, and allows for customization through environment variables. The tool can be configured locally or globally, and supports LSPs for additional context. Crush also provides options for ignoring files, allowing tools, and configuring local models. It respects `.gitignore` files and offers logging capabilities for troubleshooting and debugging.
README:
Your new coding bestie, now available in your favourite terminal.
Your tools, your code, and your workflows, wired into your LLM of choice.
- Multi-Model: choose from a wide range of LLMs, or add your own via OpenAI- or Anthropic-compatible APIs
- Flexible: switch LLMs mid-session while preserving context
- Session-Based: maintain multiple work sessions and contexts per project
- LSP-Enhanced: Crush uses LSPs for additional context, just like you do
- Extensible: add capabilities via MCPs (`http`, `stdio`, and `sse`)
- Works Everywhere: first-class support in every terminal on macOS, Linux, Windows (PowerShell and WSL), FreeBSD, OpenBSD, and NetBSD
Use a package manager:
```bash
# Homebrew
brew install charmbracelet/tap/crush

# NPM
npm install -g @charmland/crush

# Arch Linux (btw)
yay -S crush-bin

# Nix
nix run github:numtide/nix-ai-tools#crush
```
Windows users:
```bash
# Winget
winget install charmbracelet.crush

# Scoop
scoop bucket add charm https://github.com/charmbracelet/scoop-bucket.git
scoop install crush
```
Nix (NUR)
Crush is available via NUR in `nur.repos.charmbracelet.crush`.
You can also try out Crush via `nix-shell`:
```bash
# Add the NUR channel.
nix-channel --add https://github.com/nix-community/NUR/archive/main.tar.gz nur
nix-channel --update

# Get Crush in a Nix shell.
nix-shell -p '(import <nur> { pkgs = import <nixpkgs> {}; }).repos.charmbracelet.crush'
```
Crush provides NixOS and Home Manager modules via NUR. You can use these modules directly in your flake by importing them from NUR. Since the module auto-detects whether it's running in a Home Manager or NixOS context, you can import it the exact same way in either :)
```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    nur.url = "github:nix-community/NUR";
  };

  outputs = { self, nixpkgs, nur, ... }: {
    nixosConfigurations.your-hostname = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        nur.modules.nixos.default
        nur.repos.charmbracelet.modules.crush
        {
          programs.crush = {
            enable = true;
            settings = {
              providers = {
                openai = {
                  id = "openai";
                  name = "OpenAI";
                  base_url = "https://api.openai.com/v1";
                  type = "openai";
                  api_key = "sk-fake123456789abcdef...";
                  models = [
                    {
                      id = "gpt-4";
                      name = "GPT-4";
                    }
                  ];
                };
              };
              lsp = {
                go = { command = "gopls"; enabled = true; };
                nix = { command = "nil"; enabled = true; };
              };
              options = {
                context_paths = [ "/etc/nixos/configuration.nix" ];
                tui = { compact_mode = true; };
                debug = false;
              };
            };
          };
        }
      ];
    };
  };
}
```
Debian/Ubuntu
```bash
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://repo.charm.sh/apt/gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/charm.gpg
echo "deb [signed-by=/etc/apt/keyrings/charm.gpg] https://repo.charm.sh/apt/ * *" | sudo tee /etc/apt/sources.list.d/charm.list
sudo apt update && sudo apt install crush
```
Fedora/RHEL
```bash
echo '[charm]
name=Charm
baseurl=https://repo.charm.sh/yum/
enabled=1
gpgcheck=1
gpgkey=https://repo.charm.sh/yum/gpg.key' | sudo tee /etc/yum.repos.d/charm.repo
sudo yum install crush
```
Or, download it:
- Packages are available in Debian and RPM formats
- Binaries are available for Linux, macOS, Windows, FreeBSD, OpenBSD, and NetBSD
Or just install it with Go:
```bash
go install github.com/charmbracelet/crush@latest
```
> [!WARNING]
> Productivity may increase when using Crush, and you may find yourself nerd sniped when first using the application. If the symptoms persist, join the Discord and nerd snipe the rest of us.
The quickest way to get started is to grab an API key for your preferred provider, such as Anthropic, OpenAI, Groq, or OpenRouter, and just start Crush. You'll be prompted to enter your API key.
That said, you can also set environment variables for preferred providers.
| Environment Variable | Provider |
|---|---|
| `ANTHROPIC_API_KEY` | Anthropic |
| `OPENAI_API_KEY` | OpenAI |
| `OPENROUTER_API_KEY` | OpenRouter |
| `GEMINI_API_KEY` | Google Gemini |
| `CEREBRAS_API_KEY` | Cerebras |
| `HF_TOKEN` | Huggingface Inference |
| `VERTEXAI_PROJECT` | Google Cloud VertexAI (Gemini) |
| `VERTEXAI_LOCATION` | Google Cloud VertexAI (Gemini) |
| `GROQ_API_KEY` | Groq |
| `AWS_ACCESS_KEY_ID` | AWS Bedrock (Claude) |
| `AWS_SECRET_ACCESS_KEY` | AWS Bedrock (Claude) |
| `AWS_REGION` | AWS Bedrock (Claude) |
| `AWS_PROFILE` | Custom AWS Profile |
| `AZURE_OPENAI_API_ENDPOINT` | Azure OpenAI models |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI models (optional when using Entra ID) |
| `AZURE_OPENAI_API_VERSION` | Azure OpenAI models |
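For example, to launch Crush with one of the keys above already in place (the key value here is a placeholder):
```bash
# Put your real key in the environment, then start Crush.
export ANTHROPIC_API_KEY="sk-ant-..."  # placeholder value
crush
```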
Is there a provider you’d like to see in Crush? Is there an existing model that needs an update?
Crush’s default model listing is managed in Catwalk, a community-supported, open source repository of Crush-compatible models, and you’re welcome to contribute.
Crush runs great with no configuration. That said, if you do need or want to customize Crush, configuration can be added either locally to the project itself or globally, with the following priority:

1. `.crush.json`
2. `crush.json`
3. `$HOME/.config/crush/crush.json` (Windows: `%USERPROFILE%\AppData\Local\crush\crush.json`)
Configuration itself is stored as a JSON object:
```json
{
  "this-setting": { "this": "that" },
  "that-setting": ["ceci", "cela"]
}
```
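For instance, a minimal project-local `.crush.json` using the `options` block documented later in this README might look like this (a sketch, not a required config):
```json
{
  "$schema": "https://charm.land/crush.json",
  "options": {
    "debug": false
  }
}
```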
As an additional note, Crush also stores ephemeral data, such as application state, in one additional location:
```bash
# Unix
$HOME/.local/share/crush/crush.json

# Windows
%LOCALAPPDATA%\crush\crush.json
```
Crush can use LSPs for additional context to help inform its decisions, just like you would. LSPs can be added manually like so:
```json
{
  "$schema": "https://charm.land/crush.json",
  "lsp": {
    "go": {
      "command": "gopls",
      "env": {
        "GOTOOLCHAIN": "go1.24.5"
      }
    },
    "typescript": {
      "command": "typescript-language-server",
      "args": ["--stdio"]
    },
    "nix": {
      "command": "nil"
    }
  }
}
```
Crush also supports Model Context Protocol (MCP) servers through three transport types: `stdio` for command-line servers, `http` for HTTP endpoints, and `sse` for Server-Sent Events. Environment variable expansion is supported using `$(echo $VAR)` syntax.
```json
{
  "$schema": "https://charm.land/crush.json",
  "mcp": {
    "filesystem": {
      "type": "stdio",
      "command": "node",
      "args": ["/path/to/mcp-server.js"],
      "timeout": 120,
      "disabled": false,
      "env": {
        "NODE_ENV": "production"
      }
    },
    "github": {
      "type": "http",
      "url": "https://example.com/mcp/",
      "timeout": 120,
      "disabled": false,
      "headers": {
        "Authorization": "$(echo Bearer $EXAMPLE_MCP_TOKEN)"
      }
    },
    "streaming-service": {
      "type": "sse",
      "url": "https://example.com/mcp/sse",
      "timeout": 120,
      "disabled": false,
      "headers": {
        "API-Key": "$(echo $API_KEY)"
      }
    }
  }
}
```
Crush respects `.gitignore` files by default, but you can also create a `.crushignore` file to specify additional files and directories that Crush should ignore. This is useful for excluding files that you want in version control but don't want Crush to consider when providing context.
The `.crushignore` file uses the same syntax as `.gitignore` and can be placed in the root of your project or in subdirectories.
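For example, you might keep bulky generated assets out of Crush's context like so (the paths are illustrative):
```bash
# A sketch of a .crushignore — same syntax as .gitignore.
cat > .crushignore <<'EOF'
dist/
*.min.js
testdata/fixtures/
EOF
```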
By default, Crush will ask you for permission before running tool calls. If you'd like, you can allow tools to be executed without prompting you for permissions. Use this with care.
```json
{
  "$schema": "https://charm.land/crush.json",
  "permissions": {
    "allowed_tools": [
      "view",
      "ls",
      "grep",
      "edit",
      "mcp_context7_get-library-doc"
    ]
  }
}
```
You can also skip all permission prompts entirely by running Crush with the `--yolo` flag. Be very, very careful with this feature.
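For example:
```bash
# Runs Crush with all permission prompts disabled. Dangerous; use with care.
crush --yolo
```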
By default, Crush adds attribution information to Git commits and pull requests it creates. You can customize this behavior with the `attribution` option:
```json
{
  "$schema": "https://charm.land/crush.json",
  "options": {
    "attribution": {
      "co_authored_by": true,
      "generated_with": true
    }
  }
}
```
- `co_authored_by`: When true (default), adds `Co-Authored-By: Crush <[email protected]>` to commit messages
- `generated_with`: When true (default), adds a `💘 Generated with Crush` line to commit messages and PR descriptions
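With both options left at their defaults, a commit created by Crush ends with trailers like these (the subject line is illustrative, and the exact formatting may differ):
```text
Fix flaky session restore

💘 Generated with Crush

Co-Authored-By: Crush <[email protected]>
```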
Local models can also be configured via an OpenAI-compatible API. Here are two common examples:
```json
{
  "providers": {
    "ollama": {
      "name": "Ollama",
      "base_url": "http://localhost:11434/v1/",
      "type": "openai",
      "models": [
        {
          "name": "Qwen 3 30B",
          "id": "qwen3:30b",
          "context_window": 256000,
          "default_max_tokens": 20000
        }
      ]
    }
  }
}
```
```json
{
  "providers": {
    "lmstudio": {
      "name": "LM Studio",
      "base_url": "http://localhost:1234/v1/",
      "type": "openai",
      "models": [
        {
          "name": "Qwen 3 30B",
          "id": "qwen/qwen3-30b-a3b-2507",
          "context_window": 256000,
          "default_max_tokens": 20000
        }
      ]
    }
  }
}
```
Crush supports custom provider configurations for both OpenAI-compatible and Anthropic-compatible APIs.
Here’s an example configuration for Deepseek, which uses an OpenAI-compatible API. Don't forget to set `DEEPSEEK_API_KEY` in your environment.
```json
{
  "$schema": "https://charm.land/crush.json",
  "providers": {
    "deepseek": {
      "type": "openai",
      "base_url": "https://api.deepseek.com/v1",
      "api_key": "$DEEPSEEK_API_KEY",
      "models": [
        {
          "id": "deepseek-chat",
          "name": "Deepseek V3",
          "cost_per_1m_in": 0.27,
          "cost_per_1m_out": 1.1,
          "cost_per_1m_in_cached": 0.07,
          "cost_per_1m_out_cached": 1.1,
          "context_window": 64000,
          "default_max_tokens": 5000
        }
      ]
    }
  }
}
```
Custom Anthropic-compatible providers follow this format:
```json
{
  "$schema": "https://charm.land/crush.json",
  "providers": {
    "custom-anthropic": {
      "type": "anthropic",
      "base_url": "https://api.anthropic.com/v1",
      "api_key": "$ANTHROPIC_API_KEY",
      "extra_headers": {
        "anthropic-version": "2023-06-01"
      },
      "models": [
        {
          "id": "claude-sonnet-4-20250514",
          "name": "Claude Sonnet 4",
          "cost_per_1m_in": 3,
          "cost_per_1m_out": 15,
          "cost_per_1m_in_cached": 3.75,
          "cost_per_1m_out_cached": 0.3,
          "context_window": 200000,
          "default_max_tokens": 50000,
          "can_reason": true,
          "supports_attachments": true
        }
      ]
    }
  }
}
```
Crush currently supports running Anthropic models through Bedrock, with caching disabled.
- A Bedrock provider will appear once you have AWS configured, i.e. via `aws configure`
- Crush also expects `AWS_REGION` or `AWS_DEFAULT_REGION` to be set
- To use a specific AWS profile, set `AWS_PROFILE` in your environment, i.e. `AWS_PROFILE=myprofile crush` (see the sketch below)
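Putting those pieces together, a Bedrock session might be started like this (the region and profile values are illustrative):
```bash
# Credentials come from `aws configure`; region and profile are examples.
export AWS_REGION=us-east-1
AWS_PROFILE=myprofile crush
```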
Vertex AI will appear in the list of available providers when `VERTEXAI_PROJECT` and `VERTEXAI_LOCATION` are set. You will also need to be authenticated:
```bash
gcloud auth application-default login
```
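As a sketch, with illustrative project and location values (and authentication as above):
```bash
# Use your own GCP project and location values.
export VERTEXAI_PROJECT=my-project
export VERTEXAI_LOCATION=us-east5
crush
```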
To add specific models to the configuration, configure them like so:
```json
{
  "$schema": "https://charm.land/crush.json",
  "providers": {
    "vertexai": {
      "models": [
        {
          "id": "claude-sonnet-4@20250514",
          "name": "VertexAI Sonnet 4",
          "cost_per_1m_in": 3,
          "cost_per_1m_out": 15,
          "cost_per_1m_in_cached": 3.75,
          "cost_per_1m_out_cached": 0.3,
          "context_window": 200000,
          "default_max_tokens": 50000,
          "can_reason": true,
          "supports_attachments": true
        }
      ]
    }
  }
}
```
Sometimes you need to look at logs. Luckily, Crush logs all sorts of stuff. Logs are stored in `./.crush/logs/crush.log` relative to the project.
The CLI also contains some helper commands to make perusing recent logs easier:
```bash
# Print the last 1000 lines
crush logs

# Print the last 500 lines
crush logs --tail 500

# Follow logs in real time
crush logs --follow
```
Want more logging? Run `crush` with the `--debug` flag, or enable it in the config:
```json
{
  "$schema": "https://charm.land/crush.json",
  "options": {
    "debug": true,
    "debug_lsp": true
  }
}
```
By default, Crush automatically checks for the latest and greatest list of providers and models from Catwalk, the open source Crush provider database. This means that when new providers and models are available, or when model metadata changes, Crush automatically updates your local configuration.
For those with restricted internet access, or those who prefer to work in air-gapped environments, this might not be what you want, and the feature can be disabled.
To disable automatic provider updates, set `disable_provider_auto_update` in your `crush.json` config:
```json
{
  "$schema": "https://charm.land/crush.json",
  "options": {
    "disable_provider_auto_update": true
  }
}
```
Or set the `CRUSH_DISABLE_PROVIDER_AUTO_UPDATE` environment variable:
```bash
export CRUSH_DISABLE_PROVIDER_AUTO_UPDATE=1
```
Manually updating providers is possible with the `crush update-providers` command:
```bash
# Update providers remotely from Catwalk.
crush update-providers

# Update providers from a custom Catwalk base URL.
crush update-providers https://example.com/

# Update providers from a local file.
crush update-providers /path/to/local-providers.json

# Reset providers to the version embedded in Crush at build time.
crush update-providers embedded

# For more info:
crush update-providers --help
```
Crush records pseudonymous usage metrics (tied to a device-specific hash), which maintainers rely on to inform development and support priorities. The metrics include solely usage metadata; prompts and responses are NEVER collected.
Details on exactly what’s collected are in the source code.
You can opt out of metrics collection at any time by setting the following in your environment:
```bash
export CRUSH_DISABLE_METRICS=1
```
Or by setting the following in your config:
```json
{
  "options": {
    "disable_metrics": true
  }
}
```
Crush also respects the `DO_NOT_TRACK` convention, which can be enabled via `export DO_NOT_TRACK=1`.
Crush only supports model providers through official, compliant APIs. We do not support or endorse any methods that rely on personal Claude Max and GitHub Copilot accounts or OAuth workarounds, which violate Anthropic and Microsoft’s Terms of Service.
We’re committed to building sustainable, trusted integrations with model providers. If you’re a provider interested in working with us, reach out.
See the contributing guide.
We’d love to hear your thoughts on this project. Need help? We gotchu. You can find us on Discord.
Part of Charm.
Charm loves open source
Similar Open Source Tools

crush
Crush is a versatile tool designed to enhance coding workflows in your terminal. It offers support for multiple LLMs, allows for flexible switching between models, and enables session-based work management. Crush is extensible through MCPs and works across various operating systems. It can be installed using package managers like Homebrew and NPM, or downloaded directly. Crush supports various APIs like Anthropic, OpenAI, Groq, and Google Gemini, and allows for customization through environment variables. The tool can be configured locally or globally, and supports LSPs for additional context. Crush also provides options for ignoring files, allowing tools, and configuring local models. It respects `.gitignore` files and offers logging capabilities for troubleshooting and debugging.

ruby-openai
Use the OpenAI API with Ruby! 🤖🩵 Stream text with GPT-4, transcribe and translate audio with Whisper, or create images with DALL·E. The gem covers chat, streaming, vision, JSON mode, function calling, embeddings, batches, files, fine-tunes, assistants, threads, runs, image generation and editing, moderations, and Whisper translation, transcription, and speech.

context7
Context7 is a powerful tool for analyzing and visualizing data in various formats. It provides a user-friendly interface for exploring datasets, generating insights, and creating interactive visualizations. With advanced features such as data filtering, aggregation, and customization, Context7 is suitable for both beginners and experienced data analysts. The tool supports a wide range of data sources and formats, making it versatile for different use cases. Whether you are working on exploratory data analysis, data visualization, or data storytelling, Context7 can help you uncover valuable insights and communicate your findings effectively.

scylla
Scylla is an intelligent proxy pool tool designed for humanities, enabling users to extract content from the internet and build their own Large Language Models in the AI era. It features automatic proxy IP crawling and validation, an easy-to-use JSON API, a simple web-based user interface, HTTP forward proxy server, Scrapy and requests integration, and headless browser crawling. Users can start using Scylla with just one command, making it a versatile tool for various web scraping and content extraction tasks.

promptic
Promptic is a tool designed for LLM app development, providing a productive and pythonic way to build LLM applications. It leverages LiteLLM, allowing flexibility to switch LLM providers easily. Promptic focuses on building features by providing type-safe structured outputs, easy-to-build agents, streaming support, automatic prompt caching, and built-in conversation memory.

firecrawl-mcp-server
Firecrawl MCP Server is a Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities. It supports features like scrape, crawl, search, extract, and batch scrape. It provides web scraping with JS rendering, URL discovery, web search with content extraction, automatic retries with exponential backoff, credit usage monitoring, comprehensive logging system, support for cloud and self-hosted FireCrawl instances, mobile/desktop viewport support, and smart content filtering with tag inclusion/exclusion. The server includes configurable parameters for retry behavior and credit usage monitoring, rate limiting and batch processing capabilities, and tools for scraping, batch scraping, checking batch status, searching, crawling, and extracting structured information from web pages.

aiavatarkit
AIAvatarKit is a tool for building AI-based conversational avatars quickly. It supports various platforms like VRChat and cluster, along with real-world devices. The tool is extensible, allowing unlimited capabilities based on user needs. It requires VOICEVOX API, Google or Azure Speech Services API keys, and Python 3.10. Users can start conversations out of the box and enjoy seamless interactions with the avatars.

shell-ai
Shell-AI (`shai`) is a CLI utility that enables users to input commands in natural language and receive single-line command suggestions. It leverages natural language understanding and interactive CLI tools to enhance command line interactions. Users can describe tasks in plain English and receive corresponding command suggestions, making it easier to execute commands efficiently. Shell-AI supports cross-platform usage and is compatible with Azure OpenAI deployments, offering a user-friendly and efficient way to interact with the command line.

firecrawl-mcp-server
Firecrawl MCP Server is a Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities. It offers features such as web scraping, crawling, and discovery, search and content extraction, deep research and batch scraping, automatic retries and rate limiting, cloud and self-hosted support, and SSE support. The server can be configured to run with various tools like Cursor, Windsurf, SSE Local Mode, Smithery, and VS Code. It supports environment variables for cloud API and optional configurations for retry settings and credit usage monitoring. The server includes tools for scraping, batch scraping, mapping, searching, crawling, and extracting structured data from web pages. It provides detailed logging and error handling functionalities for robust performance.

mcp
Semgrep MCP Server is a beta server under active development for using Semgrep to scan code for security vulnerabilities. It provides a Model Context Protocol (MCP) for various coding tools to get specialized help in tasks. Users can connect to Semgrep AppSec Platform, scan code for vulnerabilities, customize Semgrep rules, analyze and filter scan results, and compare results. The tool is published on PyPI as semgrep-mcp and can be installed using pip, pipx, uv, poetry, or other methods. It supports CLI and Docker environments for running the server. Integration with VS Code is also available for quick installation. The project welcomes contributions and is inspired by core technologies like Semgrep and MCP, as well as related community projects and tools.

langcorn
LangCorn is an API server that enables you to serve LangChain models and pipelines with ease, leveraging the power of FastAPI for a robust and efficient experience. It offers features such as easy deployment of LangChain models and pipelines, ready-to-use authentication functionality, high-performance FastAPI framework for serving requests, scalability and robustness for language processing applications, support for custom pipelines and processing, well-documented RESTful API endpoints, and asynchronous processing for faster response times.

vim-ai
vim-ai is a plugin that adds Artificial Intelligence (AI) capabilities to Vim and Neovim. It allows users to generate code, edit text, and have interactive conversations with GPT models powered by OpenAI's API. The plugin uses OpenAI's API to generate responses, requiring users to set up an account and obtain an API key. It supports various commands for text generation, editing, and chat interactions, providing a seamless integration of AI features into the Vim text editor environment.

llmproxy
llmproxy is a reverse proxy for LLM API based on Cloudflare Worker, supporting platforms like OpenAI, Gemini, and Groq. The interface is compatible with the OpenAI API specification and can be directly accessed using the OpenAI SDK. It provides a convenient way to interact with various AI platforms through a unified API endpoint, enabling seamless integration and usage in different applications.

Gmail-MCP-Server
Gmail AutoAuth MCP Server is a Model Context Protocol (MCP) server designed for Gmail integration in Claude Desktop. It supports auto authentication and enables AI assistants to manage Gmail through natural language interactions. The server provides comprehensive features for sending emails, reading messages, managing labels, searching emails, and batch operations. It offers full support for international characters, email attachments, and Gmail API integration. Users can install and authenticate the server via Smithery or manually with Google Cloud Project credentials. The server supports both Desktop and Web application credentials, with global credential storage for convenience. It also includes Docker support and instructions for cloud server authentication.

jupyter-mcp-server
Jupyter MCP Server is a Model Context Protocol (MCP) server implementation that enables real-time interaction with Jupyter Notebooks. It allows AI to edit, document, and execute code for data analysis and visualization. The server offers features like real-time control, smart execution, and MCP compatibility. Users can use tools such as insert_execute_code_cell, append_markdown_cell, get_notebook_info, and read_cell for advanced interactions with Jupyter notebooks.

jambo
Jambo is a Python package that automatically converts JSON Schema definitions into Pydantic models. It streamlines schema validation and enforces type safety using Pydantic's validation features. The tool supports various JSON Schema features like strings, integers, floats, booleans, arrays, nested objects, and more. It enforces constraints such as minLength, maxLength, pattern, minimum, maximum, uniqueItems, and provides a zero-config approach for generating models. Jambo is designed to simplify the process of dynamically generating Pydantic models for AI frameworks.
For similar tasks

llama-coder
Llama Coder is a self-hosted Github Copilot replacement for VS Code that provides autocomplete using Ollama and Codellama. It works best with Mac M1/M2/M3 or RTX 4090, offering features like fast performance, no telemetry or tracking, and compatibility with any coding language. Users can install Ollama locally or on a dedicated machine for remote usage. The tool supports different models like stable-code and codellama with varying RAM/VRAM requirements, allowing users to optimize performance based on their hardware. Troubleshooting tips and a changelog are also provided for user convenience.

pearai-app
PearAI is an AI-powered code editor designed to enhance development by reducing the amount of coding required. It is a fork of VSCode and the main functionality lies within the 'extension/pearai' submodule. Users can contribute to the project by fixing issues, submitting bugs and feature requests, reviewing source code changes, and improving documentation. The tool aims to streamline the coding process and provide an efficient environment for developers to work in.

crush
Crush is a versatile tool designed to enhance coding workflows in your terminal. It offers support for multiple LLMs, allows for flexible switching between models, and enables session-based work management. Crush is extensible through MCPs and works across various operating systems. It can be installed using package managers like Homebrew and NPM, or downloaded directly. Crush supports various APIs like Anthropic, OpenAI, Groq, and Google Gemini, and allows for customization through environment variables. The tool can be configured locally or globally, and supports LSPs for additional context. Crush also provides options for ignoring files, allowing tools, and configuring local models. It respects `.gitignore` files and offers logging capabilities for troubleshooting and debugging.

LLM-Finetuning-Toolkit
LLM Finetuning toolkit is a config-based CLI tool for launching a series of LLM fine-tuning experiments on your data and gathering their results. It allows users to control all elements of a typical experimentation pipeline - prompts, open-source LLMs, optimization strategy, and LLM testing - through a single YAML configuration file. The toolkit supports basic, intermediate, and advanced usage scenarios, enabling users to run custom experiments, conduct ablation studies, and automate fine-tuning workflows. It provides features for data ingestion, model definition, training, inference, quality assurance, and artifact outputs, making it a comprehensive tool for fine-tuning large language models.

kaytu
Kaytu is an AI platform that enhances cloud efficiency by analyzing historical usage data and providing intelligent recommendations for optimizing instance sizes. Users can pay for only what they need without compromising the performance of their applications. The platform is easy to use with a one-line command, allows customization for specific requirements, and ensures security by extracting metrics from the client side. Kaytu is open-source and supports AWS services, with plans to expand to GCP, Azure, GPU optimization, and observability data from Prometheus in the future.

ReaLHF
ReaLHF is a distributed system designed for efficient RLHF training with Large Language Models (LLMs). It introduces a novel approach called parameter reallocation to dynamically redistribute LLM parameters across the cluster, optimizing allocations and parallelism for each computation workload. ReaL minimizes redundant communication while maximizing GPU utilization, achieving significantly higher Proximal Policy Optimization (PPO) training throughput compared to other systems. It supports large-scale training with various parallelism strategies and enables memory-efficient training with parameter and optimizer offloading. The system seamlessly integrates with HuggingFace checkpoints and inference frameworks, allowing for easy launching of local or distributed experiments. ReaLHF offers flexibility through versatile configuration customization and supports various RLHF algorithms, including DPO, PPO, RAFT, and more, while allowing the addition of custom algorithms for high efficiency.

docker-h5ai
docker-h5ai is a Docker image that provides a modern file indexer for HTTP web servers, enhancing file browsing with different views, a breadcrumb, and a tree overview. It is built on Alpine Linux with Nginx and PHP, supporting h5ai 0.30.0 and enabling PHP 8 JIT compiler. The image supports multiple architectures and can be used to host shared files with customizable configurations. Users can set up authentication using htpasswd and run the image as a real-time service. It is recommended to use HTTPS for data encryption when deploying the service.

MEGREZ
MEGREZ is a modern and elegant open-source high-performance computing platform that efficiently manages GPU resources. It allows for easy container instance creation, supports multiple nodes/multiple GPUs, modern UI environment isolation, customizable performance configurations, and user data isolation. The platform also comes with pre-installed deep learning environments, supports multiple users, features a VSCode web version, resource performance monitoring dashboard, and Jupyter Notebook support.
For similar jobs

weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

agentcloud
AgentCloud is an open-source platform that enables companies to build and deploy private LLM chat apps, empowering teams to securely interact with their data. It comprises three main components: Agent Backend, Webapp, and Vector Proxy. To run this project locally, clone the repository, install Docker, and start the services. The project is licensed under the GNU Affero General Public License, version 3 only. Contributions and feedback are welcome from the community.

oss-fuzz-gen
This framework generates fuzz targets for real-world `C`/`C++` projects with various Large Language Models (LLM) and benchmarks them via the `OSS-Fuzz` platform. It manages to successfully leverage LLMs to generate valid fuzz targets (which generate non-zero coverage increase) for 160 C/C++ projects. The maximum line coverage increase is 29% from the existing human-written targets.

LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.