kalavai-client
Turns devices into a scalable LLM platform
Stars: 100
Kalavai is an open-source platform that transforms everyday devices into an AI supercomputer by aggregating resources from multiple machines. It facilitates matchmaking of resources for large AI projects, making AI hardware accessible and affordable. Users can create local and public pools, connect with the community's resources, and share computing power. The platform aims to be a management layer for research groups and organizations, enabling users to unlock the power of existing hardware without needing a devops team. The kalavai CLI tool is used to manage both versions of the platform.
README:
⭐⭐⭐ Kalavai and our LLM pools are open source and free to use for both commercial and non-commercial purposes. If you find them useful, consider supporting us by giving our GitHub project a star, joining our Discord channel, following our Substack, and giving us a review on Product Hunt.
Kalavai is an open source tool that turns everyday devices into your very own LLM platform. It aggregates resources from multiple machines, including desktops and laptops, and is compatible with most model engines to make LLM deployment and orchestration simple and reliable.
Kalavai's goal is to make using LLMs in real applications accessible and affordable to all. It's a magic box that integrates all the components required to make LLMs useful in the age of massive computing: sourcing computing power, managing distributed infrastructure and storage, running industry-standard model engines, and orchestrating LLMs.
https://github.com/user-attachments/assets/4be59886-1b76-4400-ab5c-c803e3e414ec
https://github.com/user-attachments/assets/ea57a2ab-3924-4097-be2a-504e0988fbb1
https://github.com/user-attachments/assets/7df73bbc-d129-46aa-8ce5-0735177dedeb
https://github.com/user-attachments/assets/0d2316f3-79ea-46ac-b41e-8ef720f52672
- 27 January 2025: Support for accessing pools from remote computers
- 9 January 2025: Added support for Aphrodite Engine models
- 8 January 2025: Release of a free, public, shared pool for community LLM deployment
- 24 December 2024: Release of public BOINC pool to donate computing to scientific projects
- 23 December 2024: Release of public petals swarm
- 24 November 2024: Common pools with private user spaces
- 30 October 2024: Release of our public pool platform
We currently support the following LLM engines out of the box:
Coming soon:
Not what you were looking for? Tell us what engines you'd like to see.
Kalavai is at an early stage of its development. We encourage people to use it and give us feedback! Although we are trying to minimise breaking changes, these may occur until we have a stable version (v1.0).
- Get a free Kalavai account and access unlimited AI.
- Full documentation for the project.
- Join our Substack for updates and be part of our community.
- Join our Discord community.
The kalavai client is the main tool for interacting with the Kalavai platform: it creates and manages both local and public pools, and lets you interact with them (e.g. to deploy models). Let's go over its installation.
From release v0.5.0, you can install the kalavai client on non-worker computers: you can run a pool on a set of machines and access the LLM pool from a client on a separate, remote computer. Because the client only requires Python, many more computers can now run it.
- A laptop, desktop or Virtual Machine
- Docker engine installed (on Linux, Windows, or macOS) with privileged access.
- Python 3.10+
If you see the following error:
fatal error: Python.h: No such file or directory
    #include <Python.h>
Make sure you also install the python3-dev package. For Ubuntu distros:
sudo apt install python3-dev
The client is a Python package and can be installed with one command:
pip install kalavai-client
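Once installed, you can sanity-check the setup by asking the CLI for its help text (an illustrative check; the exact output depends on your installed version):
kalavai --help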
This is the easiest and most powerful way to experience Kalavai. It affords users the full resource capabilities of the community and access to all its deployed LLMs, via an OpenAI-compatible endpoint as well as a UI-based playground.
Check out our guide on how to join and start deploying LLMs.
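As a sketch of what using the pool from code can look like, the snippet below queries an OpenAI-compatible endpoint with the official openai Python package. The base URL, API key, and model name are illustrative placeholders; substitute the values from your Kalavai account and deployment:
# Minimal sketch: chat completion against a Kalavai pool's OpenAI-compatible endpoint.
# The base URL, API key and model name below are placeholders, not real values.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-kalavai-endpoint>/v1",  # endpoint exposed by the pool
    api_key="<your-api-key>",
)

response = client.chat.completions.create(
    model="<deployed-model-name>",  # any LLM deployed on the pool
    messages=[{"role": "user", "content": "Hello from a Kalavai LLM pool!"}],
)
print(response.choices[0].message.content)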
Kalavai is free to use, with no caps, for both commercial and non-commercial purposes. All you need to get started is one or more computers that can see each other (i.e. on the same network), and you are good to go. If you wish to join computers across different locations or networks, check out managed Kalavai.
Simply use the client to start your seed node:
kalavai pool start <pool-name>
Now you are ready to add worker nodes to this seed. To do so, generate a joining token:
$ kalavai pool token --user
Join token: <token>
Increase the power of your AI pool by inviting others to join.
Copy the joining token. On the worker node, run:
kalavai pool join <token>
Check our examples to put your new AI pool to good use!
- Single node vLLM GPU LLM deployment
- Multi node vLLM GPU LLM deployment
- Aphrodite-engine quantized LLM deployment, including Kobold interface
- Ray cluster for distributed computation.
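To give a taste of the Ray example, here is a minimal sketch of driving a Ray cluster deployed on a pool from Python. The head node address is an illustrative placeholder; 10001 is Ray's default client server port:
# Minimal sketch: submit work to a Ray cluster running on a Kalavai pool.
# The head node address below is a placeholder.
import ray

ray.init(address="ray://<ray-head-node>:10001")  # connect via the Ray client

@ray.remote
def square(x):
    return x * x

# Fan tasks out across the pool's workers and gather the results.
print(ray.get([square.remote(i) for i in range(10)]))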
If your system is not currently supported, open an issue and request it. We are expanding this list constantly.
Since worker nodes run inside Docker, any machine that can run Docker should be compatible with Kalavai. Here are instructions for Linux, Windows, and macOS.
The kalavai client, which controls and accesses pools, can be installed on any machine that has Python 3.10+.
- amd64 or x86_64 CPU architecture
- NVIDIA GPU
- AMD and Intel GPUs are currently not supported (interested in helping us test them?)
- [x] Kalavai client on Linux
- [x] [TEMPLATE] Distributed LLM deployment
- [x] Kalavai client on Windows (with WSL2)
- [x] Public LLM pools
- [x] Self-hosted LLM pools
- [x] Collaborative LLM deployment
- [x] Ray cluster support
- [x] Kalavai client on Mac
- [ ] [TEMPLATE] GPUStack support
- [ ] [TEMPLATE] exo support
- [ ] Support for AMD GPUs
- [x] Docker install path
Anything missing here? Give us a shout in the discussion board.
- PR welcome!
- Join the community and share ideas!
- Report bugs, issues and new features.
- Help improve our compatibility matrix by testing on different operating systems.
- Follow our Substack channel for news, guides and more.
- Community integrations are template jobs, built by Kalavai and the community, that make deploying distributed workflows easy for users. Anyone can extend them and contribute to the repo.
Development requires Python version <= 3.12. To set up an environment on Ubuntu:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.10 python3.10-dev
virtualenv -p python3.10 env
source env/bin/activate
sudo apt install python3.10-venv python3.10-dev -y
pip install -e .[dev]
Build python wheels:
bash publish.sh build
To run the unit tests, use:
python -m unittest
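Run without arguments, python -m unittest performs test discovery, so new tests only need to follow the standard test_*.py naming convention. A minimal illustrative test module (hypothetical file and names) looks like:
# tests/test_example.py -- illustrative only; picked up by `python -m unittest`
import unittest

class TestExample(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

if __name__ == "__main__":
    unittest.main()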
Similar Open Source Tools
ocular
Ocular is a set of modules and tools that allow you to build rich, reliable, and performant Generative AI-Powered Search Platforms without the need to reinvent Search Architecture. We help you spin up customized internal search in days, not months.
extensionOS
Extension | OS is an open-source browser extension that brings AI directly to users' web browsers, allowing them to access powerful models like LLMs seamlessly. Users can create prompts, fix grammar, and access intelligent assistance without switching tabs. The extension aims to revolutionize online information interaction by integrating AI into everyday browsing experiences. It offers features like Prompt Factory for tailored prompts, seamless LLM model access, secure API key storage, and a Mixture of Agents feature. The extension was developed to empower users to unleash their creativity with custom prompts and enhance their browsing experience with intelligent assistance.
DevoxxGenieIDEAPlugin
Devoxx Genie is a Java-based IntelliJ IDEA plugin that integrates with local and cloud-based LLM providers to aid in reviewing, testing, and explaining project code. It supports features like code highlighting, chat conversations, and adding files/code snippets to context. Users can modify REST endpoints and LLM parameters in settings, including support for cloud-based LLMs. The plugin requires IntelliJ version 2023.3.4 and JDK 17. Building and publishing the plugin is done using Gradle tasks. Users can select an LLM provider, choose code, and use commands like review, explain, or generate unit tests for code analysis.
openroleplay.ai
Open Roleplay is an open-source alternative to Character.ai. It allows users to create their own AI characters, customize them, and generate images and voices for them. Open Roleplay also supports group chat and automatic translation. The tool is built with Next.js, React.js, Tailwind CSS, Vercel, Convex, and Clerk.
momentum-core
Momentum is an open-source behavioral auditor for backend code that helps developers generate powerful insights into their codebase. It analyzes code behavior, tests it at every git push, and ensures readiness for production. Momentum understands backend code, visualizes dependencies, identifies behaviors, generates test code, runs code in the local environment, and provides debugging solutions. It aims to improve code quality, streamline testing processes, and enhance developer productivity.
botpress
Botpress is a platform for building next-generation chatbots and assistants powered by OpenAI. It provides a range of tools and integrations to help developers quickly and easily create and deploy chatbots for various use cases.
transcriptionstream
Transcription Stream is a self-hosted diarization service that works offline, allowing users to easily transcribe and summarize audio files. It includes a web interface for file management, Ollama for complex operations on transcriptions, and Meilisearch for fast full-text search. Users can upload files via SSH or web interface, with output stored in named folders. The tool requires an NVIDIA GPU and provides various scripts for installation and running. Ports for SSH, HTTP, Ollama, and Meilisearch are specified, along with access details for the SSH server and web interface. Customization options and troubleshooting tips are provided in the documentation.
PySpur
PySpur is a graph-based editor designed for LLM workflows, offering modular building blocks for easy workflow creation and debugging at node level. It allows users to evaluate final performance and promises self-improvement features in the future. PySpur is easy-to-hack, supports JSON configs for workflow graphs, and is lightweight with minimal dependencies, making it a versatile tool for workflow management in the field of AI and machine learning.
pyspur
PySpur is a graph-based editor designed for LLM (Large Language Models) workflows. It offers modular building blocks, node-level debugging, and performance evaluation. The tool is easy to hack, supports JSON configs for workflow graphs, and is lightweight with minimal dependencies. Users can quickly set up PySpur by cloning the repository, creating a .env file, starting docker services, and accessing the portal. PySpur can also work with local models served using Ollama, with steps provided for configuration. The roadmap includes features like canvas, async/batch execution, support for Ollama, new nodes, pipeline optimization, templates, code compilation, multimodal support, and more.
felafax
Felafax is a framework designed to tune LLaMa3.1 on Google Cloud TPUs for cost efficiency and seamless scaling. It provides a Jupyter notebook for continued-training and fine-tuning open source LLMs using XLA runtime. The goal of Felafax is to simplify running AI workloads on non-NVIDIA hardware such as TPUs, AWS Trainium, AMD GPU, and Intel GPU. It supports various models like LLaMa-3.1 JAX Implementation, LLaMa-3/3.1 PyTorch XLA, and Gemma2 Models optimized for Cloud TPUs with full-precision training support.
openllmetry
OpenLLMetry is a set of extensions built on top of OpenTelemetry that gives you complete observability over your LLM application. Because it uses OpenTelemetry under the hood, it can be connected to your existing observability solutions - Datadog, Honeycomb, and others. It's built and maintained by Traceloop under the Apache 2.0 license. The repo contains standard OpenTelemetry instrumentations for LLM providers and Vector DBs, as well as a Traceloop SDK that makes it easy to get started with OpenLLMetry, while still outputting standard OpenTelemetry data that can be connected to your observability stack. If you already have OpenTelemetry instrumented, you can just add any of our instrumentations directly.
CursorLens
Cursor Lens is an open-source tool that acts as a proxy between Cursor and various AI providers, logging interactions and providing detailed analytics to help developers optimize their use of AI in their coding workflow. It supports multiple AI providers, captures and logs all requests, provides visual analytics on AI usage, allows users to set up and switch between different AI configurations, offers real-time monitoring of AI interactions, tracks token usage, estimates costs based on token usage and model pricing. Built with Next.js, React, PostgreSQL, Prisma ORM, Vercel AI SDK, Tailwind CSS, and shadcn/ui components.
BotSharp
BotSharp is an open-source machine learning framework for building AI bot platforms. It provides a comprehensive set of tools and components for developing and deploying intelligent virtual assistants. BotSharp is designed to be modular and extensible, allowing developers to easily integrate it with their existing systems and applications. With BotSharp, you can quickly and easily create AI-powered chatbots, virtual assistants, and other conversational AI applications.
zero-true
Zero-True is a Python and SQL reactive computational notebook designed for building and collaborating on data-driven applications. It offers an integrated and simple environment with transparent updates, dynamic and interactive UI rendering, fast prototyping capabilities, and open-source community contributions. Users can create rich, reactive apps with ease and publish them confidently. Zero-True aims to improve data accessibility and foster collaboration among users.
llm-answer-engine
This repository contains the code and instructions needed to build a sophisticated answer engine that leverages the capabilities of Groq, Mistral AI's Mixtral, Langchain.JS, Brave Search, Serper API, and OpenAI. Designed to efficiently return sources, answers, images, videos, and follow-up questions based on user queries, this project is an ideal starting point for developers interested in natural language processing and search technologies.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM:
- Set LLM usage limits for users on different pricing tiers
- Track LLM usage on a per user and per organization basis
- Block or redact requests containing PIIs
- Improve LLM reliability with failovers, retries and caching
- Distribute API keys with rate limits and cost limits for internal development/production use cases
- Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.