Stable-Diffusion
FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Kaggle, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya, Midjourney, RunPod
Stars: 2212
Stable Diffusion is a text-to-image AI model that can generate realistic images from a given text prompt. It is a powerful tool that can be used for a variety of creative and practical applications, such as generating concept art, creating illustrations, and designing products. Stable Diffusion is also a great tool for learning about AI and machine learning. This repository contains a collection of tutorials and resources on how to use Stable Diffusion.
README:
Greetings everyone. I am Dr. Furkan Gözükara. I am an Assistant Professor in Software Engineering department of a private university (have PhD in Computer Engineering).
My LinkedIn : https://www.linkedin.com/in/furkangozukara
My Twitter : https://twitter.com/GozukaraFurkan
My Linktr : https://linktr.ee/FurkanGozukara
My Mastodon : https://mastodon.social/@furkangozukara
Our channel address (37,000+ subscribers) if you like to subscribe
Our discord (8,000+ members) to get more help
Our 1,900+ Stars GitHub Stable Diffusion and other tutorials repo
I am keeping this list up-to-date. I got upcoming new awesome video ideas. Trying to find time to do that.
I am open to any criticism you have. I am constantly trying to improve the quality of my tutorial guide videos. Please leave comments with both your suggestions and what you would like to see in future videos.
All videos have manually fixed subtitles and properly prepared video chapters. You can watch with these perfect subtitles or look for the chapters you are interested in.
Since my profession is teaching, I usually do not skip any of the important parts. Therefore, you may find my videos a little bit longer.
Playlist link on YouTube: ### Stable Diffusion Tutorials, Automatic1111 Web UI & Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Video to Anime
1.) Automatic1111 Web UI - PC - Free
2.) Automatic1111 Web UI - PC - Free
3.) Automatic1111 Web UI - PC - Free
4.) Automatic1111 Web UI - PC - Free
5.) Automatic1111 Web UI - PC - Free
DreamBooth Got Buffed - 22 January Update - Much Better Success Train Stable Diffusion Models Web UI
6.) Automatic1111 Web UI - PC - Free
7.) Automatic1111 Web UI - PC - Free
How To Do Stable Diffusion LORA Training By Using Web UI On Different Models - Tested SD 1.5, SD 2.1
8.) Automatic1111 Web UI - PC - Free
8 GB LoRA Training - Fix CUDA & xformers For DreamBooth and Textual Inversion in Automatic1111 SD UI
9.) Automatic1111 Web UI - PC - Free
How To Do Stable Diffusion Textual Inversion (TI) / Text Embeddings By Automatic1111 Web UI Tutorial
10.) Automatic1111 Web UI - PC - Free
How To Generate Stunning Epic Text By Stable Diffusion AI - No Photoshop - For Free - Depth-To-Image
11.) Python Code - Hugging Face Diffusers Script - PC - Free
12.) NMKD Stable Diffusion GUI - Open Source - PC - Free
Forget Photoshop - How To Transform Images With Text Prompts using InstructPix2Pix Model in NMKD GUI
13.) Google Colab Free - Cloud - No PC Is Required
14.) Google Colab Free - Cloud - No PC Is Required
Stable Diffusion Google Colab, Continue, Directory, Transfer, Clone, Custom Models, CKPT SafeTensors
15.) Automatic1111 Web UI - PC - Free
Become A Stable Diffusion Prompt Master By Using DAAM - Attention Heatmap For Each Used Token - Word
16.) Python Script - Gradio Based - ControlNet - PC - Free
17.) Automatic1111 Web UI - PC - Free
18.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required
19.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required
How To Install DreamBooth & Automatic1111 On RunPod & Latest Libraries - 2x Speed Up - cudDNN - CUDA
20.) Automatic1111 Web UI - PC - Free
Fantastic New ControlNet OpenPose Editor Extension & Image Mixing - Stable Diffusion Web UI Tutorial
21.) Automatic1111 Web UI - PC - Free
Automatic1111 Stable Diffusion DreamBooth Guide: Optimal Classification Images Count Comparison Test
22.) Automatic1111 Web UI - PC - Free
Epic Web UI DreamBooth Update - New Best Settings - 10 Stable Diffusion Training Compared on RunPods
23.) Automatic1111 Web UI - PC - Free
New Style Transfer Extension, ControlNet of Automatic1111 Stable Diffusion T2I-Adapter Color Control
24.) Automatic1111 Web UI - PC - Free
25.) Automatic1111 Web UI - PC - Free
26.) Automatic1111 Web UI - PC - Free
27.) Automatic1111 Web UI - PC - Free
28.) Python Script - Jupyter Based - PC - Free
Midjourney Level NEW Open Source Kandinsky 2.1 Beats Stable Diffusion - Installation And Usage Guide
29.) Automatic1111 Web UI - PC - Free
30.) Kohya Web UI - Automatic1111 Web UI - PC - Free
31.) Kaggle NoteBook (Cloud) - Free
DeepFloyd IF By Stability AI - Is It Stable Diffusion XL or Version 3? We Review and Show How To Use
32.) Python Script - Automatic1111 Web UI - PC - Free
How To Find Best Stable Diffusion Generated Images By Using DeepFace AI - DreamBooth / LoRA Training
33.) PC - Google Colab (Cloud) - Free
34.) Automatic1111 Web UI - PC - Free
35.) Automatic1111 Web UI - PC - Free
36.) Automatic1111 Web UI - PC - Free
Stable Diffusion 2 NEW Image Post Processing Scripts And Best Class / Regularization Images Datasets
37.) Automatic1111 Web UI - PC - Free
38.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required
How To Install DreamBooth & Automatic1111 On RunPod & Latest Libraries - 2x Speed Up - cudDNN - CUDA
39.) Automatic1111 Web UI - PC - Free + RunPod (Cloud)
40.) Automatic1111 Web UI - PC - Free + RunPod (Cloud)
41.) Google Colab - Gradio - Free - Cloud
42.) Local - PC - Free - Gradio
43.) Cloud - RunPod
How To Use SDXL On RunPod Tutorial. Auto Installer & Refiner & Amazing Native Diffusers Based Gradio
44.) Local - PC - Free - Google Colab (Cloud) - RunPod (Cloud) - Custom Web UI
45.) Local - PC - Free - RunPod (Cloud)
46.) Local - PC - Free
How To Use SDXL in Automatic1111 Web UI - SD Web UI vs ComfyUI - Easy Local Install Tutorial / Guide
47.) Cloud - RunPod - Paid
48.) Local - PC - Free
49.) Cloud - RunPod - Paid
50.) Cloud - Kaggle - Free
51.) Cloud - Kaggle - Free
How Use Stable Diffusion, SDXL, ControlNet, LoRAs For FREE Without A GPU On Kaggle Like Google Colab
52.) Windows - Free
53.) RunPod - Cloud - Paid
54.) Local - PC - Free
55.) RunPod - Cloud - Paid
56.) Local - PC - Free
57.) Local - PC - Free
58.) Cloud - Kaggle (Cloud) - Free
How To Do Stable Diffusion XL (SDXL) DreamBooth Training For Free - Utilizing Kaggle - Easy Tutorial
59.) Free - Local - RunPod (Cloud)
PIXART-α : First Open Source Rival to Midjourney - Better Than Stable Diffusion SDXL - Full Tutorial
60.) Free - Local - PC
61.) Free - Local - PC & RunPod (Cloud)
62.) Free - Local - PC - RunPod (Cloud) - Kaggle (Cloud)
Instantly Transfer Face By Using IP-Adapter-FaceID: Full Tutorial & GUI For Windows, RunPod & Kaggle
63.) Free - Local - PC - RunPod (Cloud) - Kaggle (Cloud)
Detailed Comparison of 160+ Best Stable Diffusion 1.5 Custom Models & 1 Click Script to Download All
64.) Free - Local - PC - RunPod (Cloud)
SUPIR: New SOTA Open Source Image Upscaler & Enhancer Model Better Than Magnific & Topaz AI Tutorial
65.) Free - Local - PC - Massed Compute (Cloud)
Full Stable Diffusion SD & XL Fine Tuning Tutorial With OneTrainer On Windows & Cloud - Zero To Hero
66.) Free - Local - PC - Cloud - Extension
67.) Free - Local - PC
68.) Free - Local - PC
IDM-VTON: The Most Amazing Virtual Clothing Try On Application - Open Source - 1 Click Install & Use
69.) Free & Paid - Cloud - RunPod - Massed Compute - Kaggle
70.) Free - Local - PC
71.) Free & Paid - Cloud - RunPod - Massed Compute - Kaggle
72.) Free All Platforms
72.) Free All Platforms
73.) Free All Platforms
74.) Free - Local - PC
Mind-Blowing Deepfake Tutorial: Turn Anyone into Your Fav Movie Star! Better than Roop & Face Fusion
75.) Massed Compute (Cloud)
Best Deepfake Open Source App ROPE - So Easy To Use Full HD Feceswap DeepFace, No GPU Required Cloud
76.) Free - Local - PC
V-Express: 1-Click AI Avatar Talking Heads Video Animation Generator - D-ID Alike - Free Open Source
77.) Free & Paid - Cloud - RunPod - Massed Compute - Kaggle
78.) Free - Local - PC
79.) Free & Paid - Cloud - RunPod - Massed Compute - Kaggle
80.) Free - Local - PC
81.) Free & Paid - Cloud - RunPod - Massed Compute - Kaggle
82.) Free & Paid - Cloud
83.) Free & Paid - Cloud - RunPod - Massed Compute - Kaggle
FLUX: The First Ever Open Source txt2img Model Truly Beats Midjourney & Others - FLUX is Awaited SD3
84.) Paid Cloud Service
85.) Free - Local - Windows
FLUX LoRA Training Simplified: From Zero to Hero with Kohya SS GUI (8GB GPU, Windows) Tutorial Guide
86.) Paid - Cloud - RunPod - Massed Compute
Blazing Fast & Ultra Cheap FLUX LoRA Training on Massed Compute & RunPod Tutorial - No GPU Required!
87.) Free - Windows - Paid - Cloud - RunPod - Massed Compute
88.) Free - Windows
89.) Paid - Cloud
90.) Free - All Platforms
91.) Free - Windows and Cloud
92.) Free - Windows and Cloud
93.) Free - Windows and Cloud
94.) Free - Windows and Cloud
FLUX Tools Outpainting, Inpainting (Fill), Redux, Depth & Canny Ultimate Tutorial Guide with SwarmUI
95.) Free - Windows and Cloud
96.) Free - Windows and Cloud
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for Stable-Diffusion
Similar Open Source Tools
Stable-Diffusion
Stable Diffusion is a text-to-image AI model that can generate realistic images from a given text prompt. It is a powerful tool that can be used for a variety of creative and practical applications, such as generating concept art, creating illustrations, and designing products. Stable Diffusion is also a great tool for learning about AI and machine learning. This repository contains a collection of tutorials and resources on how to use Stable Diffusion.
chatgpt-plus
ChatGPT-PLUS is an open-source AI assistant solution based on AI large language model API, with a built-in operational management backend for easy deployment. It integrates multiple large language models from platforms like OpenAI, Azure, ChatGLM, Xunfei Xinghuo, and Wenxin Yanyan. Additionally, it includes MidJourney and Stable Diffusion AI drawing features. The system offers a complete open-source solution with ready-to-use frontend and backend applications, providing a seamless typing experience via Websocket. It comes with various pre-trained role applications such as Xiaohongshu writer, English translation master, Socrates, Confucius, Steve Jobs, and weekly report assistant to meet various chat and application needs. Users can enjoy features like Suno Wensheng music, integration with MidJourney/Stable Diffusion AI drawing, personal WeChat QR code for payment, built-in Alipay and WeChat payment functions, support for various membership packages and point card purchases, and plugin API integration for developing powerful plugins using large language model functions.
LLMForEverybody
LLMForEverybody is a comprehensive repository covering various aspects of large language models (LLMs) including pre-training, architecture, optimizers, activation functions, attention mechanisms, tokenization, parallel strategies, training frameworks, deployment, fine-tuning, quantization, GPU parallelism, prompt engineering, agent design, RAG architecture, enterprise deployment challenges, evaluation metrics, and current hot topics in the field. It provides detailed explanations, tutorials, and insights into the workings and applications of LLMs, making it a valuable resource for researchers, developers, and enthusiasts interested in understanding and working with large language models.
AI-YinMei
AI-YinMei is an AI virtual anchor Vtuber development tool (N card version). It supports fastgpt knowledge base chat dialogue, a complete set of solutions for LLM large language models: [fastgpt] + [one-api] + [Xinference], supports docking bilibili live broadcast barrage reply and entering live broadcast welcome speech, supports Microsoft edge-tts speech synthesis, supports Bert-VITS2 speech synthesis, supports GPT-SoVITS speech synthesis, supports expression control Vtuber Studio, supports painting stable-diffusion-webui output OBS live broadcast room, supports painting picture pornography public-NSFW-y-distinguish, supports search and image search service duckduckgo (requires magic Internet access), supports image search service Baidu image search (no magic Internet access), supports AI reply chat box [html plug-in], supports AI singing Auto-Convert-Music, supports playlist [html plug-in], supports dancing function, supports expression video playback, supports head touching action, supports gift smashing action, supports singing automatic start dancing function, chat and singing automatic cycle swing action, supports multi scene switching, background music switching, day and night automatic switching scene, supports open singing and painting, let AI automatically judge the content.
llm-resource
llm-resource is a comprehensive collection of high-quality resources for Large Language Models (LLM). It covers various aspects of LLM including algorithms, training, fine-tuning, alignment, inference, data engineering, compression, evaluation, prompt engineering, AI frameworks, AI basics, AI infrastructure, AI compilers, LLM application development, LLM operations, AI systems, and practical implementations. The repository aims to gather and share valuable resources related to LLM for the community to benefit from.
apo
AutoPilot Observability (APO) is an out-of-the-box observability platform that provides one-click installation and ready-to-use capabilities. APO's OneAgent supports one-click configuration-free installation of Tracing probes, collects application fault scene logs, infrastructure metrics, network metrics of applications and downstream dependencies, and Kubernetes events. It supports collecting causality metrics based on eBPF implementation. APO integrates OpenTelemetry probes, otel-collector, Jaeger, ClickHouse, and VictoriaMetrics, reducing user configuration work. APO innovatively integrates eBPF technology with the OpenTelemetry ecosystem, significantly reducing data storage volume. It offers guided troubleshooting using eBPF technology to assist users in pinpointing fault causes on a single page.
LLMLanding
LLMLanding is a repository focused on practical implementation of large models, covering topics from theory to practice. It provides a structured learning path for training large models, including specific tasks like training 1B-scale models, exploring SFT, and working on specialized tasks such as code generation, NLP tasks, and domain-specific fine-tuning. The repository emphasizes a dual learning approach: quickly applying existing tools for immediate output benefits and delving into foundational concepts for long-term understanding. It offers detailed resources and pathways for in-depth learning based on individual preferences and goals, combining theory with practical application to avoid overwhelm and ensure sustained learning progress.
how-to-optim-algorithm-in-cuda
This repository documents how to optimize common algorithms based on CUDA. It includes subdirectories with code implementations for specific optimizations. The optimizations cover topics such as compiling PyTorch from source, NVIDIA's reduce optimization, OneFlow's elementwise template, fast atomic add for half data types, upsample nearest2d optimization in OneFlow, optimized indexing in PyTorch, OneFlow's softmax kernel, linear attention optimization, and more. The repository also includes learning resources related to deep learning frameworks, compilers, and optimization techniques.
geekai
GeekAI is an open-source AI assistant solution based on AI large language model API, featuring a complete system with ready-to-use front-end and back-end management, providing a seamless typing experience via Websocket. It integrates various pre-trained character applications like Xiaohongshu writing assistant, English translation master, Socrates, Confucius, Steve Jobs, and weekly report assistant. The tool supports multiple large language models from platforms like OpenAI, Azure, Wenxin Yanyan, Xunfei Xinghuo, and Tsinghua ChatGLM. Additionally, it includes MidJourney and Stable Diffusion AI drawing functionalities for creating various artworks such as text-based images, face swapping, and blending images. Users can utilize personal WeChat QR codes for payment without the need for enterprise payment channels, and the tool offers integrated payment options like Alipay and WeChat Pay with support for multiple membership packages and point card purchases. It also features a plugin API for developing powerful plugins using large language model functions, including built-in plugins for Weibo hot search, today's headlines, morning news, and AI drawing functions.
higress
Higress is an open-source cloud-native API gateway built on the core of Istio and Envoy, based on Alibaba's internal practice of Envoy Gateway. It is designed for AI-native API gateway, serving AI businesses such as Tongyi Qianwen APP, Bailian Big Model API, and Machine Learning PAI platform. Higress provides capabilities to interface with LLM model vendors, AI observability, multi-model load balancing/fallback, AI token flow control, and AI caching. It offers features for AI gateway, Kubernetes Ingress gateway, microservices gateway, and security protection gateway, with advantages in production-level scalability, stream processing, extensibility, and ease of use.
Desktop-Pet-Godot
Godog is an AI desktop pet powered by a large language model and created with Godot. It aims to provide a versatile and rich desktop AI pet that users can customize to create unique pet images and behaviors. The tool is lightweight, easy to develop with Godot, compatible with various large language models, offers pre-made character functions and multiple appearances, supports multimodal capabilities, and allows users to easily build their own AI desktop pets on top of the existing features.
Avalonia-Assistant
Avalonia-Assistant is an open-source desktop intelligent assistant that aims to provide a user-friendly interactive experience based on the Avalonia UI framework and the integration of Semantic Kernel with OpenAI or other large LLM models. By utilizing Avalonia-Assistant, you can perform various desktop operations through text or voice commands, enhancing your productivity and daily office experience.
aituber-kit
AITuber-Kit is a tool that enables users to interact with AI characters, conduct AITuber live streams, and engage in external integration modes. Users can easily converse with AI characters using various LLM APIs, stream on YouTube with AI character reactions, and send messages to server apps via WebSocket. The tool provides settings for API keys, character configurations, voice synthesis engines, and more. It supports multiple languages and allows customization of VRM models and background images. AITuber-Kit follows the MIT license and offers guidelines for adding new languages to the project.
anylabeling
AnyLabeling is a tool for effortless data labeling with AI support from YOLO and Segment Anything. It combines features from LabelImg and Labelme with an improved UI and auto-labeling capabilities. Users can annotate images with polygons, rectangles, circles, lines, and points, as well as perform auto-labeling using YOLOv5 and Segment Anything. The tool also supports text detection, recognition, and Key Information Extraction (KIE) labeling, with multiple language options available such as English, Vietnamese, and Chinese.
For similar tasks
Stable-Diffusion
Stable Diffusion is a text-to-image AI model that can generate realistic images from a given text prompt. It is a powerful tool that can be used for a variety of creative and practical applications, such as generating concept art, creating illustrations, and designing products. Stable Diffusion is also a great tool for learning about AI and machine learning. This repository contains a collection of tutorials and resources on how to use Stable Diffusion.
awesome-generative-ai
Awesome Generative AI is a curated list of modern Generative Artificial Intelligence projects and services. Generative AI technology creates original content like images, sounds, and texts using machine learning algorithms trained on large data sets. It can produce unique and realistic outputs such as photorealistic images, digital art, music, and writing. The repo covers a wide range of applications in art, entertainment, marketing, academia, and computer science.
latentbox
Latent Box is a curated collection of resources for AI, creativity, and art. It aims to bridge the information gap with high-quality content, promote diversity and interdisciplinary collaboration, and maintain updates through community co-creation. The website features a wide range of resources, including articles, tutorials, tools, and datasets, covering various topics such as machine learning, computer vision, natural language processing, generative art, and creative coding.
For similar jobs
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
LLMStack
LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.
VisionCraft
The VisionCraft API is a free API for using over 100 different AI models. From images to sound.
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
PyRIT
PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.
tabby
Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
Magick
Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.