Kolo

A one-stop shop for data generation, fine-tuning, and testing LLMs locally using the best tools available. Keeping it simple and versatile!

Kolo is a lightweight tool designed for fast and efficient data generation, fine-tuning and testing of Large Language Models (LLMs) on your local machine. It leverages cutting-edge tools to simplify the fine-tuning and data generation process, making it as quick and seamless as possible.

🚀 Features

  • Runs Locally: No need for cloud-based services; fine-tune models on your own machine.
  • 🛠 Easy Setup: Simple installation of all dependencies with Docker. No more wasting time setting up your own LLM development environment; we already did it for you!
  • 📁 Generate Training Data: Generate synthetic QA training data from your text files, quickly and easily!
  • 🔌 Support for Popular Frameworks: Integrates with major LLM toolkits such as Unsloth, Torchtune, Llama.cpp, Ollama and Open WebUI.

🛠 Tools Used

Kolo is built using a powerful stack of LLM tools:

  • Unsloth – Open-source LLM fine-tuning; faster training, lower VRAM.
  • Torchtune – Native PyTorch library simplifying LLM fine-tuning workflows.
  • Llama.cpp – Fast C/C++ inference for Llama models.
  • Ollama – Portable, user-friendly LLM model management and deployment.
  • Docker – Containerized environment ensuring consistent, scalable deployments.
  • Open WebUI – Intuitive self-hosted web interface for LLM management.

Recommended System Requirements

  • Windows 10 or higher. Might work on Linux and macOS (untested)
  • Nvidia GPU with CUDA 12.1 capability and 8GB+ of VRAM
  • 16GB+ System RAM

It may work on other systems, but results may vary. Let us know!

Issues or Feedback

Join our Discord group!

🏃 Getting Started

1️⃣ Install Dependencies

Ensure Hyper-V is installed.

Ensure WSL 2 is installed (an alternative guide is also available).

Ensure Docker Desktop is installed.
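
You can sanity-check these prerequisites from PowerShell before building; both are standard Windows/Docker CLI commands, not Kolo scripts:

wsl --status
docker --version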

2️⃣ Build the Image

./build_image.ps1

3️⃣ Run the Container

If running for the first time:

./create_and_run_container.ps1

For subsequent runs:

./run_container.ps1
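
To confirm the container came up, a standard Docker check works (the container name Kolo uses is not specified here, so look for the freshly started entry):

docker ps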

4️⃣ Copy Training Data

./copy_training_data.ps1 -f examples/God.jsonl -d data.jsonl

Don't have training data? Check out our synthetic QA data generation guide!
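
If you write your own file instead, note that JSONL means one JSON object per line. The exact fields Kolo expects are defined in the guide above; the record below is only a generic QA-style illustration, not Kolo's authoritative schema:

{"question": "What does Kolo do?", "answer": "It generates data, fine-tunes, and tests LLMs locally."}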

5️⃣ Train Model

Using Unsloth

./train_model_unsloth.ps1 -OutputDir "GodOutput" -Quantization "Q4_K_M" -TrainData "data.jsonl"

All available parameters

./train_model_unsloth.ps1 -Epochs 3 -LearningRate 1e-4 -TrainData "data.jsonl" -BaseModel "unsloth/Llama-3.2-1B-Instruct-bnb-4bit" -ChatTemplate "llama-3.1" -LoraRank 16 -LoraAlpha 16 -LoraDropout 0 -MaxSeqLength 1024 -WarmupSteps 10 -SaveSteps 500 -SaveTotalLimit 5 -Seed 1337 -SchedulerType "linear" -BatchSize 2 -OutputDir "GodOutput" -Quantization "Q4_K_M" -WeightDecay 0
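
The full command above shows every available flag. As a rough sketch only (same script and flags, with values chosen purely for illustration, not as recommendations), a lower-VRAM run might reduce the batch size, sequence length, and LoRA rank:

./train_model_unsloth.ps1 -OutputDir "GodOutput" -Quantization "Q4_K_M" -TrainData "data.jsonl" -BatchSize 1 -MaxSeqLength 512 -LoraRank 8 -LoraAlpha 8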

Using Torchtune

Requirements: Create a Hugging Face account and generate an access token. You will also need permission from Meta to use their models: search for the base model name on the Hugging Face website and request access before training.

./train_model_torchtune.ps1 -OutputDir "GodOutput" -Quantization "Q4_K_M" -TrainData "data.json" -HfToken "your_token"

All available parameters

./train_model_torchtune.ps1 -HfToken "your_token" -Epochs 3 -LearningRate 1e-4 -TrainData "data.json" -BaseModel "Meta-llama/Llama-3.2-1B-Instruct" -LoraRank 16 -LoraAlpha 16 -LoraDropout 0 -MaxSeqLength 1024 -WarmupSteps 10 -Seed 1337 -SchedulerType "cosine" -BatchSize 2 -OutputDir "GodOutput" -Quantization "Q4_K_M" -WeightDecay 0

Note: If re-training with the same OutputDir, delete the existing directory first:

./delete_model.ps1 "GodOutput" -Tool "unsloth|torchtune"

For more information about fine-tuning parameters, please refer to the Fine Tune Training Guide.

6️⃣ Install Model

Using Unsloth

./install_model.ps1 "God" -Tool "unsloth" -OutputDir "GodOutput" -Quantization "Q4_K_M"

Using Torchtune

./install_model.ps1 "God" -Tool "torchtune" -OutputDir "GodOutput" -Quantization "Q4_K_M"

7️⃣ Test Model

Open your browser and navigate to localhost:8080
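
If you would rather test from a script than the browser, Ollama exposes an HTTP API on its default port 11434. Whether the Kolo container publishes that port is an assumption here, so adjust the host and port to your setup:

Invoke-RestMethod -Uri "http://localhost:11434/api/generate" -Method Post -ContentType "application/json" -Body (@{ model = "God"; prompt = "Hello!"; stream = $false } | ConvertTo-Json)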

Other Commands

Uninstalls the model from Ollama.

./uninstall_model.ps1 "God"

Lists all models installed in Ollama, as well as the training model directories for both torchtune and unsloth.

./list_models.ps1

Copies all the scripts and files inside /scripts into the Kolo container at /app/.

./copy_scripts.ps1

Copies all the torchtune config files inside /torchtune into the Kolo container at /app/torchtune.

./copy_configs.ps1

🔧 Advanced Users

SSH Access

To quickly SSH into the Kolo container for installing additional tools or running scripts directly:

./connect.ps1

If prompted for a password, use:

123

Alternatively, you can connect manually via SSH:

ssh root@localhost -p 2222
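
The same credentials also work for one-off commands, which is handy for quickly inspecting the container; for example, listing the scripts under /app:

ssh root@localhost -p 2222 "ls /app"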

WinSCP (SFTP Access)

You can use WinSCP or any other SFTP file manager to access the Kolo container’s file system. This allows you to manage, modify, add, or remove scripts and files easily.

Connection Details:

  • Host: localhost
  • Port: 2222
  • Username: root
  • Password: 123

This setup ensures you can easily transfer files between your local machine and the container.
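
If you prefer the command line to WinSCP, standard OpenSSH tooling accepts the same credentials; for example, copying a local script (the file name here is just a placeholder) into the container's /app directory:

scp -P 2222 .\my_script.ps1 root@localhost:/app/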
