aikit

🏗️ Fine-tune, build, and deploy open-source LLMs easily!

Stars: 425


README:

AIKit ✨


AIKit is a comprehensive platform for quickly getting started with hosting, deploying, building, and fine-tuning large language models (LLMs).

AIKit offers two main capabilities:

  • Inference: AIKit uses LocalAI, which supports a wide range of inference capabilities and formats. LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as Kubectl AI, Chatbot-UI and many more, to send requests to open LLMs!

  • Fine-Tuning: AIKit offers an extensible fine-tuning interface. It supports Unsloth for a fast, memory-efficient, and easy fine-tuning experience.

👉 For full documentation, please see the AIKit website!

Features

Quick Start

You can get started with AIKit quickly on your local machine without a GPU!

docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b

After running this, navigate to http://localhost:8080/chat to access the WebUI!
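Because `docker run -d` returns immediately, the server may not answer for a short while after the container starts. The following is a minimal sketch of a readiness check in Python, not part of AIKit itself. It assumes the server answers `GET /v1/models` (the standard OpenAI-style model listing route) once it is up; the function names `is_ready` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.error
import urllib.request


def is_ready(url, timeout=2.0):
    """Return True if a GET request to `url` succeeds with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: not ready yet.
        return False


def wait_until_ready(url, attempts=30, delay=2.0, probe=is_ready, sleep=time.sleep):
    """Poll `url` up to `attempts` times, pausing `delay` seconds between tries.

    `probe` and `sleep` are injectable so the loop can be exercised
    without a running server.
    """
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False
```

With the quick start container running, `wait_until_ready("http://localhost:8080/v1/models")` should return `True` once the endpoint responds, assuming that route is served.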

API

AIKit provides an OpenAI API compatible endpoint, so you can use any OpenAI API compatible client to send requests to open LLMs!

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'

Output should be similar to:

{
  // ...
    "model": "llama-3.1-8b-instruct",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."
            }
        }
    ],
  // ...
}

That's it! 🎉 The API is OpenAI compatible, so AIKit works as a drop-in replacement with any OpenAI API compatible client.
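The same request can be sent from Python using only the standard library. This is a rough sketch rather than an official client: the endpoint, model name, and response shape are taken from the curl example and sample output above, while the helper names (`build_chat_request`, `extract_reply`, `chat`) are made up for illustration.

```python
import json
import urllib.request

# Port published by the docker run command in the quick start.
BASE_URL = "http://localhost:8080"


def build_chat_request(prompt, model="llama-3.1-8b-instruct", base_url=BASE_URL):
    """Build the POST request that the curl example sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response):
    """Pull the assistant text out of a chat completion response dict."""
    return response["choices"][0]["message"]["content"]


def chat(prompt, **kwargs):
    """Send the prompt to the running container and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        return extract_reply(json.load(resp))
```

With the container running, `chat("explain kubernetes in a sentence")` should return the assistant's answer as a string. To target a different pre-made model, pass its model name via the `model` argument.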

Pre-made Models

AIKit comes with pre-made models that you can use out-of-the-box!

If the model you need is not included, you can always create your own images and host them in a container registry of your choice!

CPU

Note: AIKit supports both AMD64 and ARM64 CPUs. You can run the same command on either architecture, and Docker will automatically pull the correct image for your CPU.

Depending on your CPU capabilities, AIKit will automatically select the most optimized instruction set.

| Model | Optimization | Parameters | Command | Model Name | License |
| --- | --- | --- | --- | --- | --- |
| 🦙 Llama 3.2 | Instruct | 1B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:1b | llama-3.2-1b-instruct | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:3b | llama-3.2-3b-instruct | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b | llama-3.1-8b-instruct | Llama |
| 🦙 Llama 3.3 | Instruct | 70B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.3:70b | llama-3.3-70b-instruct | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b | mixtral-8x7b-instruct | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b | phi-3.5-3.8b-instruct | MIT |
| 🔡 Gemma 2 | Instruct | 2B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma2:2b | gemma-2-2b-instruct | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b | codestral-22b | MNPL |
| QwQ | | 32B | docker run -d --rm -p 8080:8080 ghcr.io/sozercan/qwq:32b | qwq-32b-preview | Apache 2.0 |

NVIDIA CUDA

Note: To enable GPU acceleration, please see GPU Acceleration.

The only difference between the commands in the CPU and GPU sections is the --gpus all flag, which enables GPU acceleration.

| Model | Optimization | Parameters | Command | Model Name | License |
| --- | --- | --- | --- | --- | --- |
| 🦙 Llama 3.2 | Instruct | 1B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:1b | llama-3.2-1b-instruct | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:3b | llama-3.2-3b-instruct | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b | llama-3.1-8b-instruct | Llama |
| 🦙 Llama 3.3 | Instruct | 70B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.3:70b | llama-3.3-70b-instruct | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b | mixtral-8x7b-instruct | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b | phi-3.5-3.8b-instruct | MIT |
| 🔡 Gemma 2 | Instruct | 2B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma2:2b | gemma-2-2b-instruct | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b | codestral-22b | MNPL |
| QwQ | | 32B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/qwq:32b | qwq-32b-preview | Apache 2.0 |
| 📸 Flux 1 Dev | Text to image | 12B | docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/flux1:dev | flux-1-dev | FLUX.1 [dev] Non-Commercial License |
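Since the CPU and CUDA commands differ only in the `--gpus all` flag, a script that launches these images can build both variants from one place. The sketch below is illustrative only; `run_command` is a hypothetical helper, and it simply reproduces the command layout used in the tables.

```python
import shlex


def run_command(image, gpu=False, port=8080):
    """Return the docker run argument list for a pre-made AIKit image."""
    args = ["docker", "run", "-d", "--rm"]
    if gpu:
        # The single difference between the CPU and NVIDIA CUDA commands.
        args += ["--gpus", "all"]
    args += ["-p", f"{port}:{port}", image]
    return args


# Print the GPU variant as a copy-pasteable shell command.
print(shlex.join(run_command("ghcr.io/sozercan/llama3.1:8b", gpu=True)))
```

The argument list can be passed directly to `subprocess.run` if you want to start the container from Python instead of printing the command.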

Apple Silicon (experimental)

Note: To enable GPU acceleration on Apple Silicon, please see the Podman Desktop documentation. For more information, please see GPU Acceleration.

Apple Silicon is an experimental runtime and may change in the future. This runtime is specific to Apple Silicon and will not work as expected on other architectures, including Intel Macs.

Only GGUF models are supported on Apple Silicon.

| Model | Optimization | Parameters | Command | Model Name | License |
| --- | --- | --- | --- | --- | --- |
| 🦙 Llama 3.2 | Instruct | 1B | podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/llama3.2:1b | llama-3.2-1b-instruct | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/llama3.2:3b | llama-3.2-3b-instruct | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/llama3.1:8b | llama-3.1-8b-instruct | Llama |
| 🅿️ Phi 3.5 | Instruct | 3.8B | podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/phi3.5:3.8b | phi-3.5-3.8b-instruct | MIT |
| 🔡 Gemma 2 | Instruct | 2B | podman run -d --rm --device /dev/dri -p 8080:8080 ghcr.io/sozercan/applesilicon/gemma2:2b | gemma-2-2b-instruct | Gemma |

What's next?

👉 For more information, including how to fine-tune models or create your own images, please see the AIKit website!
