stability-sdk
SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
Stars: 2413
The stability-sdk is a Python package that provides a client implementation for interacting with the Stability API. This API allows users to generate images, upscale images, and animate images using a variety of different models and settings. The stability-sdk makes it easy to use the Stability API from Python code, and it provides a number of helpful features such as command line usage, support for multiple models, and the ability to filter artifacts by type.
README:
Client implementations that interact with the Stability API.
Follow the instructions on Platform to obtain an API key.
Install the PyPI package via:
pip install stability-sdk
client.py
is both a command line client and an API class that wraps the gRPC based API. To try the client:
- Use Python venv:
python3 -m venv pyenv
- Set up in venv dependencies:
pyenv/bin/pip3 install -e .
-
pyenv/bin/activate
to use the venv. - Set the
STABILITY_HOST
environment variable. This is by default set to the production endpointgrpc.stability.ai:443
. - Set the
STABILITY_KEY
environment variable.
Then to invoke:
python3 -m stability_sdk generate -W 1024 -H 1024 "A stunning house."
It will generate and put PNGs in your current directory.
To upscale:
python3 -m stability_sdk upscale -i "/path/to/image.png"
Install with
pip install stability-sdk[anim_ui]
Then run with
python3 -m stability_sdk animate --gui
Be sure to check out Platform for comprehensive documentation on how to interact with our API.
usage: python -m stability_sdk generate [-h] [--height HEIGHT] [--width WIDTH]
[--start_schedule START_SCHEDULE] [--end_schedule END_SCHEDULE]
[--cfg_scale CFG_SCALE] [--sampler SAMPLER] [--steps STEPS]
[--style_preset STYLE_PRESET] [--seed SEED] [--prefix PREFIX] [--engine ENGINE]
[--num_samples NUM_SAMPLES] [--artifact_types ARTIFACT_TYPES]
[--no-store] [--show] [--init_image INIT_IMAGE] [--mask_image MASK_IMAGE]
[prompt ...]
positional arguments:
prompt
options:
-h, --help show this help message and exit
--height HEIGHT, -H HEIGHT
[1024] height of image
--width WIDTH, -W WIDTH
[1024] width of image
--start_schedule START_SCHEDULE
[0.5] start schedule for init image (must be greater than 0; 1 is full strength
text prompt, no trace of image)
--end_schedule END_SCHEDULE
[0.01] end schedule for init image
--cfg_scale CFG_SCALE, -C CFG_SCALE
[7.0] CFG scale factor
--sampler SAMPLER, -A SAMPLER
[auto-select] (ddim, plms, k_euler, k_euler_ancestral, k_heun, k_dpm_2,
k_dpm_2_ancestral, k_lms, k_dpmpp_2m, k_dpmpp_2s_ancestral)
--steps STEPS, -s STEPS
[auto] number of steps
--style_preset STYLE_PRESET
[none] (3d-model, analog-film, anime, cinematic, comic-book, digital-art, enhance,
fantasy-art, isometric, line-art, low-poly, modeling-compound, neon-punk, origami,
photographic, pixel-art, tile-texture)
--seed SEED, -S SEED random seed to use
--prefix PREFIX, -p PREFIX
output prefixes for artifacts
--artifact_types ARTIFACT_TYPES, -t ARTIFACT_TYPES
filter artifacts by type (ARTIFACT_IMAGE, ARTIFACT_TEXT, ARTIFACT_CLASSIFICATIONS, etc)
--no-store do not write out artifacts
--num_samples NUM_SAMPLES, -n NUM_SAMPLES
number of samples to generate
--show open artifacts using PIL
--engine ENGINE, -e ENGINE
engine to use for inference
--init_image INIT_IMAGE, -i INIT_IMAGE
Init image
--mask_image MASK_IMAGE, -m MASK_IMAGE
Mask image
For upscale:
usage: client.py upscale
[-h]
--init_image INIT_IMAGE
[--height HEIGHT] [--width WIDTH] [--prefix PREFIX] [--artifact_types ARTIFACT_TYPES]
[--no-store] [--show] [--engine ENGINE]
positional arguments:
prompt (ignored in esrgan engines)
options:
-h, --help show this help message and exit
--init_image INIT_IMAGE, -i INIT_IMAGE
Init image
--height HEIGHT, -H HEIGHT
height of upscaled image in pixels
--width WIDTH, -W WIDTH
width of upscaled image in pixels
--steps STEPS, -s STEPS
[auto] number of steps (ignored in esrgan engines)
--seed SEED, -S SEED random seed to use (ignored in esrgan engines)
--cfg_scale CFG_SCALE, -C CFG_SCALE
[7.0] CFG scale factor (ignored in esrgan engines)
--prefix PREFIX, -p PREFIX
output prefixes for artifacts
--artifact_types ARTIFACT_TYPES, -t ARTIFACT_TYPES
filter artifacts by type (ARTIFACT_IMAGE, ARTIFACT_TEXT, ARTIFACT_CLASSIFICATIONS, etc)
--no-store do not write out artifacts
--show open artifacts using PIL
--engine ENGINE, -e ENGINE
engine to use for upscale
If a language you would like to connect to the API with is not currently documented on Platform you can use the following protobuf definition to compile stubs for your language:
- Typescript client: https://github.com/jakiestfu/stability-ts
- Guide to building for Ruby: https://github.com/kmcphillips/stability-sdk/blob/main/src/ruby/README.md
- C# client: https://github.com/Katarzyna-Kadziolka/StabilityClient.Net
Usage of the Stability API falls under the STABILITY AI API Terms of Service.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for stability-sdk
Similar Open Source Tools
stability-sdk
The stability-sdk is a Python package that provides a client implementation for interacting with the Stability API. This API allows users to generate images, upscale images, and animate images using a variety of different models and settings. The stability-sdk makes it easy to use the Stability API from Python code, and it provides a number of helpful features such as command line usage, support for multiple models, and the ability to filter artifacts by type.
instructor_ex
Instructor is a tool designed to structure outputs from OpenAI and other OSS LLMs by coaxing them to return JSON that maps to a provided Ecto schema. It allows for defining validation logic to guide LLMs in making corrections, and supports automatic retries. Instructor is primarily used with the OpenAI API but can be extended to work with other platforms. The tool simplifies usage by creating an ecto schema, defining a validation function, and making calls to chat_completion with instructions for the LLM. It also offers features like max_retries to fix validation errors iteratively.
Rodel.Agent
Rodel Agent is a Windows desktop application that integrates chat, text-to-image, text-to-speech, and machine translation services, providing users with a comprehensive desktop AI experience. The application supports mainstream AI services and aims to enhance user interaction through various AI functionalities.
gorilla
Gorilla is a tool that enables LLMs to use tools by invoking APIs. Given a natural language query, Gorilla comes up with the semantically- and syntactically- correct API to invoke. With Gorilla, you can use LLMs to invoke 1,600+ (and growing) API calls accurately while reducing hallucination. Gorilla also releases APIBench, the largest collection of APIs, curated and easy to be trained on!
ChatDev
ChatDev is a virtual software company powered by intelligent agents like CEO, CPO, CTO, programmer, reviewer, tester, and art designer. These agents collaborate to revolutionize the digital world through programming. The platform offers an easy-to-use, highly customizable, and extendable framework based on large language models, ideal for studying collective intelligence. ChatDev introduces innovative methods like Iterative Experience Refinement and Experiential Co-Learning to enhance software development efficiency. It supports features like incremental development, Docker integration, Git mode, and Human-Agent-Interaction mode. Users can customize ChatChain, Phase, and Role settings, and share their software creations easily. The project is open-source under the Apache 2.0 License and utilizes data licensed under CC BY-NC 4.0.
Free-GPT4-WEB-API
FreeGPT4-WEB-API is a Python server that allows you to have a self-hosted GPT-4 Unlimited and Free WEB API, via the latest Bing's AI. It uses Flask and GPT4Free libraries. GPT4Free provides an interface to the Bing's GPT-4. The server can be configured by editing the `FreeGPT4_Server.py` file. You can change the server's port, host, and other settings. The only cookie needed for the Bing model is `_U`.
Endia
Endia is a dynamic Array library for Scientific Computing, offering automatic differentiation of arbitrary order, complex number support, dual API with PyTorch-like imperative or JAX-like functional interface, and JIT Compilation for speeding up training and inference. It can handle complex valued functions, perform both forward and reverse-mode automatic differentiation, and has a builtin JIT compiler. Endia aims to advance AI & Scientific Computing by pushing boundaries with clear algorithms, providing high-performance open-source code that remains readable and pythonic, and prioritizing clarity and educational value over exhaustive features.
aiogram_bot_template
This repository provides a template for easy asynchronous bot development for Telegram using Aiogram library. It includes setup instructions using Docker Compose, migration commands for TortoiseORM, and examples for working with TortoiseORM. The template aims to simplify the process of creating Telegram bots with features like FastAPI integration and Docker Compose setup. The repository also includes a list of tasks to be completed, such as changing SQLAlchemy to TortoiseORM, adding FastAPI for webhook, and setting up CI/CD examples.
langport
LangPort is an open-source platform for serving large language models. It aims to provide a super fast LLM inference service with core features including Huggingface transformers support, distributed serving system, streaming generation, batch inference, and support for various model architectures. It offers compatibility with OpenAI, FauxPilot, HuggingFace, and Tabby APIs. The project supports model architectures like LLaMa, GLM, GPT2, and GPT Neo, and has been tested with models such as NingYu, Vicuna, ChatGLM, and WizardLM. LangPort also provides features like dynamic batch inference, int4 quantization, and generation logprobs parameter.
InternGPT
InternGPT (iGPT) is a pointing-language-driven visual interactive system that enhances communication between users and chatbots by incorporating pointing instructions. It improves chatbot accuracy in vision-centric tasks, especially in complex visual scenarios. The system includes an auxiliary control mechanism to enhance the control capability of the language model. InternGPT features a large vision-language model called Husky, fine-tuned for high-quality multi-modal dialogue. Users can interact with ChatGPT by clicking, dragging, and drawing using a pointing device, leading to efficient communication and improved chatbot performance in vision-related tasks.
skyrim
Skyrim is a weather forecasting tool that enables users to run large weather models using consumer-grade GPUs. It provides access to state-of-the-art foundational weather models through a well-maintained infrastructure. Users can forecast weather conditions, such as wind speed and direction, by running simulations on their own GPUs or using modal volume or cloud services like s3 buckets. Skyrim supports various large weather models like Graphcast, Pangu, Fourcastnet, and DLWP, with plans for future enhancements like ensemble prediction and model quantization.
ai-renamer
ai-renamer is a Node.js CLI tool that intelligently renames files in a specified directory using Ollama models like Llama, Gemma, Phi, etc. It allows users to set case style, model, maximum characters in the filename, and output language. The tool utilizes the change-case library for case styling and requires Ollama and at least one LLM to be installed on the system. Users can contribute by opening new issues or making pull requests. Licensed under GPL-3.0.
rosa
ROSA is an AI Agent designed to interact with ROS-based robotics systems using natural language queries. It can generate system reports, read and parse ROS log files, adapt to new robots, and run various ROS commands using natural language. The tool is versatile for robotics research and development, providing an easy way to interact with robots and the ROS environment.
embodied-agents
Embodied Agents is a toolkit for integrating large multi-modal models into existing robot stacks with just a few lines of code. It provides consistency, reliability, scalability, and is configurable to any observation and action space. The toolkit is designed to reduce complexities involved in setting up inference endpoints, converting between different model formats, and collecting/storing datasets. It aims to facilitate data collection and sharing among roboticists by providing Python-first abstractions that are modular, extensible, and applicable to a wide range of tasks. The toolkit supports asynchronous and remote thread-safe agent execution for maximal responsiveness and scalability, and is compatible with various APIs like HuggingFace Spaces, Datasets, Gymnasium Spaces, Ollama, and OpenAI. It also offers automatic dataset recording and optional uploads to the HuggingFace hub.
SUPIR
SUPIR is an AI-based image processing and upscaling tool that leverages cutting-edge technology to enhance image quality and resolution. The tool provides users with the ability to upscale images with high generalization and quality, as well as specific settings for light degradation scenarios. It offers a range of models and checkpoints for different use cases, along with detailed instructions for installation and usage. SUPIR also includes features for color fixing, linear CFG adjustments, and various prompts for image enhancement. The tool is designed for non-commercial use only and comes with a contact email for inquiries and permission requests for commercial use.
AIOS
AIOS, a Large Language Model (LLM) Agent operating system, embeds large language model into Operating Systems (OS) as the brain of the OS, enabling an operating system "with soul" -- an important step towards AGI. AIOS is designed to optimize resource allocation, facilitate context switch across agents, enable concurrent execution of agents, provide tool service for agents, maintain access control for agents, and provide a rich set of toolkits for LLM Agent developers.
For similar tasks
stability-sdk
The stability-sdk is a Python package that provides a client implementation for interacting with the Stability API. This API allows users to generate images, upscale images, and animate images using a variety of different models and settings. The stability-sdk makes it easy to use the Stability API from Python code, and it provides a number of helpful features such as command line usage, support for multiple models, and the ability to filter artifacts by type.
ap-plugin
AP-PLUGIN is an AI drawing plugin for the Yunzai series robot framework, allowing you to have a convenient AI drawing experience in the input box. It uses the open source Stable Diffusion web UI as the backend, deploys it for free, and generates a variety of images with richer functions.
fabric
Fabric is an open-source framework for augmenting humans using AI. It provides a structured approach to breaking down problems into individual components and applying AI to them one at a time. Fabric includes a collection of pre-defined Patterns (prompts) that can be used for a variety of tasks, such as extracting the most interesting parts of YouTube videos and podcasts, writing essays, summarizing academic papers, creating AI art prompts, and more. Users can also create their own custom Patterns. Fabric is designed to be easy to use, with a command-line interface and a variety of helper apps. It is also extensible, allowing users to integrate it with their own AI applications and infrastructure.
comflowy
Comflowy is a community dedicated to providing comprehensive tutorials, fostering discussions, and building a database of workflows and models for ComfyUI and Stable Diffusion. Our mission is to lower the entry barrier for ComfyUI users, promote its mainstream adoption, and contribute to the growth of the AI generative graphics community.
Building-AI-Applications-with-ChatGPT-APIs
This repository is for the book 'Building AI Applications with ChatGPT APIs' published by Packt. It provides code examples and instructions for mastering ChatGPT, Whisper, and DALL-E APIs through building innovative AI projects. Readers will learn to develop AI applications using ChatGPT APIs, integrate them with frameworks like Flask and Django, create AI-generated art with DALL-E APIs, and optimize ChatGPT models through fine-tuning.
comfyui-photoshop
ComfyUI for Photoshop is a plugin that integrates with an AI-powered image generation system to enhance the Photoshop experience with features like unlimited generative fill, customizable back-end, AI-powered artistry, and one-click transformation. The plugin requires a minimum of 6GB graphics memory and 12GB RAM. Users can install the plugin and set up the ComfyUI workflow using provided links and files. Additionally, specific files like Check points, Loras, and Detailer Lora are required for different functionalities. Support and contributions are encouraged through GitHub.
awesome-generative-ai
Awesome Generative AI is a curated list of modern Generative Artificial Intelligence projects and services. Generative AI technology creates original content like images, sounds, and texts using machine learning algorithms trained on large data sets. It can produce unique and realistic outputs such as photorealistic images, digital art, music, and writing. The repo covers a wide range of applications in art, entertainment, marketing, academia, and computer science.
painting-droid
Painting Droid is an AI-powered cross-platform painting app inspired by MS Paint, expandable with plugins and open. It utilizes various AI models, from paid providers to self-hosted open-source models, as well as some lightweight ones built into the app. Features include regular painting app features, AI-generated content filling and augmentation, filters and effects, image manipulation, plugin support, and cross-platform compatibility.
For similar jobs
ap-plugin
AP-PLUGIN is an AI drawing plugin for the Yunzai series robot framework, allowing you to have a convenient AI drawing experience in the input box. It uses the open source Stable Diffusion web UI as the backend, deploys it for free, and generates a variety of images with richer functions.
99AI
99AI is a commercializable AI web application based on NineAI 2.4.2 (no authorization, no backdoors, no piracy, integrated front-end and back-end integration packages, supports Docker rapid deployment). The uncompiled source code is temporarily closed. Compared with the stable version, the development version is faster.
midjourney-proxy
Midjourney-proxy is a proxy for the Discord channel of MidJourney, enabling API-based calls for AI drawing. It supports Imagine instructions, adding image base64 as a placeholder, Blend and Describe commands, real-time progress tracking, Chinese prompt translation, prompt sensitive word pre-detection, user-token connection to WSS, multi-account configuration, and more. For more advanced features, consider using midjourney-proxy-plus, which includes Shorten, focus shifting, image zooming, local redrawing, nearly all associated button actions, Remix mode, seed value retrieval, account pool persistence, dynamic maintenance, /info and /settings retrieval, account settings configuration, Niji bot robot, InsightFace face replacement robot, and an embedded management dashboard.
comflowyspace
Comflowyspace is an open-source AI image and video generation tool that aims to provide a more user-friendly and accessible experience than existing tools like SDWebUI and ComfyUI. It simplifies the installation, usage, and workflow management of AI image and video generation, making it easier for users to create and explore AI-generated content. Comflowyspace offers features such as one-click installation, workflow management, multi-tab functionality, workflow templates, and an improved user interface. It also provides tutorials and documentation to lower the learning curve for users. The tool is designed to make AI image and video generation more accessible and enjoyable for a wider range of users.
comflowy
Comflowy is a community dedicated to providing comprehensive tutorials, fostering discussions, and building a database of workflows and models for ComfyUI and Stable Diffusion. Our mission is to lower the entry barrier for ComfyUI users, promote its mainstream adoption, and contribute to the growth of the AI generative graphics community.
stability-sdk
The stability-sdk is a Python package that provides a client implementation for interacting with the Stability API. This API allows users to generate images, upscale images, and animate images using a variety of different models and settings. The stability-sdk makes it easy to use the Stability API from Python code, and it provides a number of helpful features such as command line usage, support for multiple models, and the ability to filter artifacts by type.
awesome-generative-ai
A curated list of Generative AI projects, tools, artworks, and models
comfyui_fk_server
This is an ideal Comfyui translation plugin that allows any long text input box in Comfyui to support Chinese input and automatic translation (using Baidu translation). It also includes error correction translation feature and keyword polishing feature for generating professional AI drawing prompts (using Zhipu AI big model). Additionally, it provides a one-click fix feature for correcting model references in workflows, greatly improving workflow model call correction efficiency (based on model name matching). The plugin requires Baidu translation API key for translation functionality and Zhipu AI API key for keyword polishing functionality. After installation, users can enable automatic translation mode and keyword polishing feature by double-clicking any long text input box in Comfyui.