LLM-Alchemy-Chamber

A friendly neighborhood repository with diverse experiments and adventures in the world of LLMs.

Stars: 117

LLM Alchemy Chamber is a repository for exploring Large Language Models (LLMs) through experiments and projects. It contains scripts and notebooks for fine-tuning different LLMs, quantizing models for performance, generating datasets for instruction/QA tasks, and more. The collection is aimed at beginners and enthusiasts who want to delve into the mystical realm of LLMs.

README:

LLM Alchemy Chamber 🧙‍♂️✨

Welcome to a friendly neighborhood repository featuring diverse experiments and adventures in the world of LLMs. This is no ordinary repository; it's an alchemical blend of scripts, notebooks, and experiments dedicated to the mystical realm of Large Language Models (LLMs).

Alchemical Scripts

| Projects | GitHub Link | Colab Link | Blog Link | Description |
| --- | --- | --- | --- | --- |
| YouTube Cloner | Folder | Fireship GPT | Blog coming soon | An attempt at cloning YouTubers by fine-tuning LLMs |

| Finetuning | GitHub Link | Colab Link | Blog Link | Description |
| --- | --- | --- | --- | --- |
| Gemma Finetuning | GitHub | Colab | A Beginner's Guide to Fine-Tuning Gemma | Notebook to fine-tune Gemma models |
| Mistral-7b Finetuning | GitHub | Colab | A Beginner's Guide to Fine-Tuning Mistral 7B Instruct Model | Notebook to fine-tune the Mistral-7b model |
| Mixtral Finetuning | GitHub | Colab | A Beginner's Guide to Fine-Tuning Mixtral Instruct Model | Notebook to fine-tune Mixtral models |
| Llama2 Finetuning | GitHub | Colab | | Notebook to fine-tune the Llama2-7b model |

| Quantization | GitHub Link | Colab Link | Blog Link | Description |
| --- | --- | --- | --- | --- |
| AWQ Quantization | GitHub | Colab | Squeeze Every Drop of Performance from Your LLM with AWQ | Quantize an LLM using AWQ |
| GGUF Quantization | GitHub | Colab | Run any Huggingface model locally | Quantize an LLM to the GGUF format |

| Data Prep | GitHub Link | Colab Link | Description |
| --- | --- | --- | --- |
| Documents -> Dataset | GitHub | Colab | Given documents, generate an instruction/QA dataset for fine-tuning LLMs |
| Topic -> Dataset | GitHub | Colab | Given a topic, generate a dataset for fine-tuning LLMs |
| Alpaca Dataset Generation | GitHub | Colab | The original instruction-dataset generation procedure from the Alpaca paper |
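
To make the Data Prep entries more concrete, here is a minimal sketch of what an instruction/QA record can look like and how it might be rendered into a single training string. It assumes Alpaca-style `instruction` / `input` / `output` fields and prompt wording in the style of the Stanford Alpaca templates; the notebooks in this repo may use a different schema. The names `format_record`, `to_jsonl`, and `EXAMPLE` are hypothetical and exist only for illustration.

```python
import json

# Prompt templates in the style of the Stanford Alpaca release (assumption:
# the notebooks here may use different wording or a chat template instead).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

# Hypothetical example record; real records would come from a generation notebook.
EXAMPLE = {
    "instruction": "Summarize the paragraph in one sentence.",
    "input": "Quantization stores model weights in fewer bits.",
    "output": "Quantization shrinks a model by lowering weight precision.",
}


def format_record(record: dict) -> str:
    """Render one record as prompt + response, picking the template by input."""
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    # str.format ignores unused keyword arguments such as "output".
    return template.format(**record) + record["output"]


def to_jsonl(records: list) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    print(format_record(EXAMPLE))
    print(to_jsonl([EXAMPLE]))
```

Keeping the raw records as JSON Lines and applying the prompt template only at training time makes it easy to switch templates later without regenerating the dataset.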

Repo Structure

├── DataPrep (Notebook to generate synthetic data)
│   ├── dataset_prep.ipynb
│   └── ...
├── Deployment (TGI/VLLM scripts for testing)
│   └── ...
├── Finetuning (Finalized Finetuning Scripts)
│   ├── Gemma_finetuning_notebook.ipynb
│   ├── Llama2_finetuning_notebook.ipynb
│   ├── Mistral_finetuning_notebook.ipynb
│   ├── Mixtral_finetuning_notebook.ipynb
│   └── ...
├── LLMS (LLM experiments)
│   ├── ambari
│   │   └── ...
│   ├── CodeLLama
│   │   └── ...
│   ├── Gemma
│   │   ├── finetune-gemma.ipynb
│   │   └── gemma-sft.py
│   ├── Llama2
│   │   └── ...
│   ├── Mistral-7b
│   │   └── ...
│   └── Mixtral
│       └── ...
├── Projects (Upcoming ideas to explore)
│   └── YT_Clones
│       ├── Fireship_clone.ipynb
│       ├── youtube_channel_scraper.py
│       └── ...
├── Quantization
│   └── ...
├── utils
│   └── streaming_inference_hf.ipynb
└── RAG (Retrieval Augmented Generation)
    ├── 1_Naive_RAG.ipynb
    ├── 2_Semantic_Chunking_RAG.ipynb
    ├── 3_Sentence_Window_Retrieval_RAG.ipynb
    ├── 4_Auto_Merging_Retrieval_RAG.ipynb
    ├── 5_Agentic_RAG.ipynb
    └── 6_Visual_RAG.ipynb
