stability-sdk
SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
Stars: 2413
The stability-sdk is a Python package that provides a client implementation for interacting with the Stability API. This API allows users to generate images, upscale images, and animate images using a variety of different models and settings. The stability-sdk makes it easy to use the Stability API from Python code, and it provides a number of helpful features such as command line usage, support for multiple models, and the ability to filter artifacts by type.
README:
Client implementations that interact with the Stability API.
Follow the instructions on Platform to obtain an API key.
Install the PyPI package via:
pip install stability-sdk
client.py is both a command line client and an API class that wraps the gRPC based API. To try the client:
- Use a Python venv: python3 -m venv pyenv
- Install the dependencies into the venv: pyenv/bin/pip3 install -e .
- Run source pyenv/bin/activate to use the venv.
- Set the STABILITY_HOST environment variable. By default this is set to the production endpoint grpc.stability.ai:443.
- Set the STABILITY_KEY environment variable.
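For example (the key value is a placeholder for your own):
export STABILITY_HOST=grpc.stability.ai:443
export STABILITY_KEY=your-api-key-here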
Then to invoke:
python3 -m stability_sdk generate -W 1024 -H 1024 "A stunning house."
This will generate images and write the resulting PNGs to your current directory.
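The same generation can be driven from Python. The following is a minimal sketch based on the SDK's client module; the engine name is illustrative (substitute one available to your account), and STABILITY_KEY is read from the environment as configured above.

import io
import os

from PIL import Image
from stability_sdk import client
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation

# Connect using the STABILITY_KEY environment variable set above.
# The engine name is an example; check Platform for available engines.
stability_api = client.StabilityInference(
    key=os.environ["STABILITY_KEY"],
    engine="stable-diffusion-xl-1024-v1-0",
    verbose=True,
)

answers = stability_api.generate(
    prompt="A stunning house.",
    width=1024,
    height=1024,
    samples=1,
)

# A response can carry several artifacts; keep only the images. This is
# the same filtering the --artifact_types flag performs on the CLI.
for resp in answers:
    for artifact in resp.artifacts:
        if artifact.type == generation.ARTIFACT_IMAGE:
            Image.open(io.BytesIO(artifact.binary)).save(f"{artifact.seed}.png")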
To upscale:
python3 -m stability_sdk upscale -i "/path/to/image.png"
To use the animation UI, install the package with its optional extras:
pip install stability-sdk[anim_ui]
Then run:
python3 -m stability_sdk animate --gui
Be sure to check out Platform for comprehensive documentation on how to interact with our API.
For generate:
usage: python -m stability_sdk generate [-h] [--height HEIGHT] [--width WIDTH]
[--start_schedule START_SCHEDULE] [--end_schedule END_SCHEDULE]
[--cfg_scale CFG_SCALE] [--sampler SAMPLER] [--steps STEPS]
[--style_preset STYLE_PRESET] [--seed SEED] [--prefix PREFIX] [--engine ENGINE]
[--num_samples NUM_SAMPLES] [--artifact_types ARTIFACT_TYPES]
[--no-store] [--show] [--init_image INIT_IMAGE] [--mask_image MASK_IMAGE]
[prompt ...]
positional arguments:
prompt
options:
-h, --help show this help message and exit
--height HEIGHT, -H HEIGHT
[1024] height of image
--width WIDTH, -W WIDTH
[1024] width of image
--start_schedule START_SCHEDULE
[0.5] start schedule for init image (must be greater than 0; 1 is full strength
text prompt, no trace of image)
--end_schedule END_SCHEDULE
[0.01] end schedule for init image
--cfg_scale CFG_SCALE, -C CFG_SCALE
[7.0] CFG scale factor
--sampler SAMPLER, -A SAMPLER
[auto-select] (ddim, plms, k_euler, k_euler_ancestral, k_heun, k_dpm_2,
k_dpm_2_ancestral, k_lms, k_dpmpp_2m, k_dpmpp_2s_ancestral)
--steps STEPS, -s STEPS
[auto] number of steps
--style_preset STYLE_PRESET
[none] (3d-model, analog-film, anime, cinematic, comic-book, digital-art, enhance,
fantasy-art, isometric, line-art, low-poly, modeling-compound, neon-punk, origami,
photographic, pixel-art, tile-texture)
--seed SEED, -S SEED random seed to use
--prefix PREFIX, -p PREFIX
output prefixes for artifacts
--artifact_types ARTIFACT_TYPES, -t ARTIFACT_TYPES
filter artifacts by type (ARTIFACT_IMAGE, ARTIFACT_TEXT, ARTIFACT_CLASSIFICATIONS, etc)
--no-store do not write out artifacts
--num_samples NUM_SAMPLES, -n NUM_SAMPLES
number of samples to generate
--show open artifacts using PIL
--engine ENGINE, -e ENGINE
engine to use for inference
--init_image INIT_IMAGE, -i INIT_IMAGE
Init image
--mask_image MASK_IMAGE, -m MASK_IMAGE
Mask image
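The image-to-image flags above (--init_image, --mask_image, and the schedules) map onto keyword arguments of the same generate call. A minimal sketch, reusing the stability_api client from the earlier example; the file path is a placeholder:

from PIL import Image

# start_schedule controls how strongly the text prompt overrides the init
# image: 1.0 ignores the image entirely, values near 0 keep it mostly intact.
init = Image.open("init.png")  # placeholder path
answers = stability_api.generate(
    prompt="A stunning house.",
    init_image=init,
    start_schedule=0.5,
    end_schedule=0.01,
)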
For upscale:
usage: client.py upscale
[-h]
--init_image INIT_IMAGE
[--height HEIGHT] [--width WIDTH] [--prefix PREFIX] [--artifact_types ARTIFACT_TYPES]
[--no-store] [--show] [--engine ENGINE]
positional arguments:
prompt (ignored in esrgan engines)
options:
-h, --help show this help message and exit
--init_image INIT_IMAGE, -i INIT_IMAGE
Init image
--height HEIGHT, -H HEIGHT
height of upscaled image in pixels
--width WIDTH, -W WIDTH
width of upscaled image in pixels
--steps STEPS, -s STEPS
[auto] number of steps (ignored in esrgan engines)
--seed SEED, -S SEED random seed to use (ignored in esrgan engines)
--cfg_scale CFG_SCALE, -C CFG_SCALE
[7.0] CFG scale factor (ignored in esrgan engines)
--prefix PREFIX, -p PREFIX
output prefixes for artifacts
--artifact_types ARTIFACT_TYPES, -t ARTIFACT_TYPES
filter artifacts by type (ARTIFACT_IMAGE, ARTIFACT_TEXT, ARTIFACT_CLASSIFICATIONS, etc)
--no-store do not write out artifacts
--show open artifacts using PIL
--engine ENGINE, -e ENGINE
engine to use for upscale
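Upscaling is also available programmatically. A minimal sketch, assuming an ESRGAN upscale engine (the engine name is an example; check Platform for the current list):

import io
import os

from PIL import Image
from stability_sdk import client
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation

# A client configured for upscaling rather than generation.
upscale_api = client.StabilityInference(
    key=os.environ["STABILITY_KEY"],
    upscale_engine="esrgan-v1-x2plus",  # example engine name
    verbose=True,
)

img = Image.open("/path/to/image.png")
answers = upscale_api.upscale(init=img)  # optionally pass width= or height= for the output size

for resp in answers:
    for artifact in resp.artifacts:
        if artifact.type == generation.ARTIFACT_IMAGE:
            Image.open(io.BytesIO(artifact.binary)).save("image_upscaled.png")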
If the language you want to use with the API is not yet documented on Platform, you can compile stubs for it from the API's protobuf definitions. Community clients already exist for several languages:
- Typescript client: https://github.com/jakiestfu/stability-ts
- Guide to building for Ruby: https://github.com/kmcphillips/stability-sdk/blob/main/src/ruby/README.md
- C# client: https://github.com/Katarzyna-Kadziolka/StabilityClient.Net
Usage of the Stability API falls under the STABILITY AI API Terms of Service.
Similar Open Source Tools
instructor_ex
Instructor is a tool designed to structure outputs from OpenAI and other OSS LLMs by coaxing them to return JSON that maps to a provided Ecto schema. It allows for defining validation logic to guide LLMs in making corrections, and supports automatic retries. Instructor is primarily used with the OpenAI API but can be extended to work with other platforms. The tool simplifies usage by creating an ecto schema, defining a validation function, and making calls to chat_completion with instructions for the LLM. It also offers features like max_retries to fix validation errors iteratively.
openai-kit
OpenAIKit is a Swift package designed to facilitate communication with the OpenAI API. It provides methods to interact with various OpenAI services such as chat, models, completions, edits, images, embeddings, files, moderations, and speech to text. The package encourages the use of environment variables to securely inject the OpenAI API key and organization details. It also offers error handling for API requests through the `OpenAIKit.APIErrorResponse`.
DemoGPT
DemoGPT is an all-in-one agent library that provides tools, prompts, frameworks, and LLM models for streamlined agent development. It leverages GPT-3.5-turbo to generate LangChain code, creating interactive Streamlit applications. The tool is designed for creating intelligent, interactive, and inclusive solutions in LLM-based application development. It offers model flexibility, iterative development, and a commitment to user engagement. Future enhancements include integrating Gorilla for autonomous API usage and adding a publicly available database for refining the generation process.
curator
Bespoke Curator is an open-source tool for data curation and structured data extraction. It provides a Python library for generating synthetic data at scale, with features like programmability, performance optimization, caching, and integration with HuggingFace Datasets. The tool includes a Curator Viewer for dataset visualization and offers a rich set of functionalities for creating and refining data generation strategies.
shards
Shards is a high-performance, multi-platform, type-safe programming language designed for visual development. It is a dataflow visual programming language that enables building full-fledged apps and games without traditional coding. Shards features automatic type checking, optimized shard implementations for high performance, and an intuitive visual workflow for beginners. The language allows seamless round-trip engineering between code and visual models, empowering users to create multi-platform apps easily. Shards also powers an upcoming AI-powered game creation system, enabling real-time collaboration and game development in a low to no-code environment.
Woodpecker
Woodpecker is a tool designed to correct hallucinations in Multimodal Large Language Models (MLLMs) by introducing a training-free method that picks out and corrects inconsistencies between generated text and image content. It consists of five stages: key concept extraction, question formulation, visual knowledge validation, visual claim generation, and hallucination correction. Woodpecker can be easily integrated with different MLLMs and provides interpretable results by accessing intermediate outputs of the stages. The tool has shown significant improvements in accuracy over baseline models like MiniGPT-4 and mPLUG-Owl.
CodeFuse-ModelCache
Codefuse-ModelCache is a semantic cache for large language models (LLMs) that aims to optimize services by introducing a caching mechanism. It helps reduce the cost of inference deployment, improve model performance and efficiency, and provide scalable services for large models. The project caches pre-generated model results to reduce response time for similar requests and enhance user experience. It integrates various embedding frameworks and local storage options, offering functionalities like cache-writing, cache-querying, and cache-clearing through RESTful API. The tool supports multi-tenancy, system commands, and multi-turn dialogue, with features for data isolation, database management, and model loading schemes. Future developments include data isolation based on hyperparameters, enhanced system prompt partitioning storage, and more versatile embedding models and similarity evaluation algorithms.
HebTTS
HebTTS is a language modeling approach to diacritic-free Hebrew text-to-speech (TTS) system. It addresses the challenge of accurately mapping text to speech in Hebrew by proposing a language model that operates on discrete speech representations and is conditioned on a word-piece tokenizer. The system is optimized using weakly supervised recordings and outperforms diacritic-based Hebrew TTS systems in terms of content preservation and naturalness of generated speech.
dogoap
Data-Oriented GOAP (Goal-Oriented Action Planning) is a library that implements GOAP in a data-oriented way, allowing for dynamic setup of states, actions, and goals. It includes bevy_dogoap for Bevy integration. It is useful for NPCs performing tasks dependent on each other, enabling NPCs to improvise reaching goals, and offers a middle ground between Utility AI and HTNs. The library is inspired by the F.E.A.R GDC talk and provides a minimal Bevy example for implementation.
LLaMa2lang
LLaMa2lang is a repository containing convenience scripts to finetune LLaMa3-8B (or any other foundation model) for chat towards any language that isn't English. The repository aims to improve the performance of LLaMa3 for non-English languages by combining fine-tuning with RAG. Users can translate datasets, extract threads, turn threads into prompts, and finetune models using QLoRA and PEFT. Additionally, the repository supports translation models like OPUS, M2M, MADLAD, and base datasets like OASST1 and OASST2. The process involves loading datasets, translating them, combining checkpoints, and running inference using the newly trained model. The repository also provides benchmarking scripts to choose the right translation model for a target language.
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
tensorrtllm_backend
The TensorRT-LLM Backend is a Triton backend designed to serve TensorRT-LLM models with Triton Inference Server. It supports features like inflight batching, paged attention, and more. Users can access the backend through pre-built Docker containers or build it using scripts provided in the repository. The backend can be used to create models for tasks like tokenizing, inferencing, de-tokenizing, ensemble modeling, and more. Users can interact with the backend using provided client scripts and query the server for metrics related to request handling, memory usage, KV cache blocks, and more. Testing for the backend can be done following the instructions in the 'ci/README.md' file.
Numpy.NET
Numpy.NET is the most complete .NET binding for NumPy, empowering .NET developers with extensive functionality for scientific computing, machine learning, and AI. It provides multi-dimensional arrays, matrices, linear algebra, FFT, and more via a strongly typed API. Numpy.NET does not require a local Python installation, as it uses Python.Included to package embedded Python 3.7. Multi-threading must be handled carefully to avoid deadlocks or access-violation exceptions. Performance considerations include overhead when calling NumPy from C# and the efficiency of data transfer between C# and Python. Numpy.NET aims to match the completeness of the original NumPy library and is generated using CodeMinion by parsing the NumPy documentation. The project is MIT licensed and supported by JetBrains.
matsciml
The Open MatSci ML Toolkit is a flexible framework for machine learning in materials science. It provides a unified interface to a variety of materials science datasets, as well as a set of tools for data preprocessing, model training, and evaluation. The toolkit is designed to be easy to use for both beginners and experienced researchers, and it can be used to train models for a wide range of tasks, including property prediction, materials discovery, and materials design.
For similar tasks
ap-plugin
AP-PLUGIN is an AI drawing plugin for the Yunzai family of bot frameworks that provides a convenient AI drawing experience directly from the chat input box. It uses the open-source Stable Diffusion web UI as its backend, can be deployed for free, and generates a wide variety of images with rich functionality.
fabric
Fabric is an open-source framework for augmenting humans using AI. It provides a structured approach to breaking down problems into individual components and applying AI to them one at a time. Fabric includes a collection of pre-defined Patterns (prompts) that can be used for a variety of tasks, such as extracting the most interesting parts of YouTube videos and podcasts, writing essays, summarizing academic papers, creating AI art prompts, and more. Users can also create their own custom Patterns. Fabric is designed to be easy to use, with a command-line interface and a variety of helper apps. It is also extensible, allowing users to integrate it with their own AI applications and infrastructure.
comflowy
Comflowy is a community dedicated to providing comprehensive tutorials, fostering discussions, and building a database of workflows and models for ComfyUI and Stable Diffusion. Our mission is to lower the entry barrier for ComfyUI users, promote its mainstream adoption, and contribute to the growth of the AI generative graphics community.
Building-AI-Applications-with-ChatGPT-APIs
This repository is for the book 'Building AI Applications with ChatGPT APIs' published by Packt. It provides code examples and instructions for mastering ChatGPT, Whisper, and DALL-E APIs through building innovative AI projects. Readers will learn to develop AI applications using ChatGPT APIs, integrate them with frameworks like Flask and Django, create AI-generated art with DALL-E APIs, and optimize ChatGPT models through fine-tuning.
comfyui-photoshop
ComfyUI for Photoshop is a plugin that integrates with an AI-powered image generation system to enhance the Photoshop experience with features like unlimited generative fill, customizable back-end, AI-powered artistry, and one-click transformation. The plugin requires a minimum of 6GB graphics memory and 12GB RAM. Users can install the plugin and set up the ComfyUI workflow using the provided links and files. Additionally, specific files like checkpoints, LoRAs, and a Detailer LoRA are required for different functionalities. Support and contributions are encouraged through GitHub.
awesome-generative-ai
Awesome Generative AI is a curated list of modern Generative Artificial Intelligence projects and services. Generative AI technology creates original content like images, sounds, and texts using machine learning algorithms trained on large data sets. It can produce unique and realistic outputs such as photorealistic images, digital art, music, and writing. The repo covers a wide range of applications in art, entertainment, marketing, academia, and computer science.
painting-droid
Painting Droid is an AI-powered cross-platform painting app inspired by MS Paint, open and expandable with plugins. It utilizes various AI models, from paid providers to self-hosted open-source models, as well as some lightweight ones built into the app. Features include regular painting app features, AI-generated content filling and augmentation, filters and effects, image manipulation, plugin support, and cross-platform compatibility.
For similar jobs
ap-plugin
AP-PLUGIN is an AI drawing plugin for the Yunzai family of bot frameworks that provides a convenient AI drawing experience directly from the chat input box. It uses the open-source Stable Diffusion web UI as its backend, can be deployed for free, and generates a wide variety of images with rich functionality.
99AI
99AI is a commercializable AI web application based on NineAI 2.4.2 (no authorization required, no backdoors, no pirated components; ships as an integrated front-end and back-end package with rapid Docker deployment). The uncompiled source code is temporarily closed; the development version iterates faster than the stable version.
midjourney-proxy
Midjourney-proxy is a proxy for the Discord channel of MidJourney, enabling API-based calls for AI drawing. It supports Imagine instructions, adding image base64 as a placeholder, Blend and Describe commands, real-time progress tracking, Chinese prompt translation, prompt sensitive word pre-detection, user-token connection to WSS, multi-account configuration, and more. For more advanced features, consider using midjourney-proxy-plus, which includes Shorten, focus shifting, image zooming, local redrawing, nearly all associated button actions, Remix mode, seed value retrieval, account pool persistence, dynamic maintenance, /info and /settings retrieval, account settings configuration, Niji bot robot, InsightFace face replacement robot, and an embedded management dashboard.
comflowyspace
Comflowyspace is an open-source AI image and video generation tool that aims to provide a more user-friendly and accessible experience than existing tools like SDWebUI and ComfyUI. It simplifies the installation, usage, and workflow management of AI image and video generation, making it easier for users to create and explore AI-generated content. Comflowyspace offers features such as one-click installation, workflow management, multi-tab functionality, workflow templates, and an improved user interface. It also provides tutorials and documentation to lower the learning curve for users. The tool is designed to make AI image and video generation more accessible and enjoyable for a wider range of users.
comflowy
Comflowy is a community dedicated to providing comprehensive tutorials, fostering discussions, and building a database of workflows and models for ComfyUI and Stable Diffusion. Our mission is to lower the entry barrier for ComfyUI users, promote its mainstream adoption, and contribute to the growth of the AI generative graphics community.
awesome-generative-ai
A curated list of Generative AI projects, tools, artworks, and models
comfyui_fk_server
comfyui_fk_server is a ComfyUI translation plugin that lets any long-text input box in ComfyUI accept Chinese input with automatic translation (via Baidu Translate). It also offers an error-correcting translation feature, a keyword-polishing feature for generating professional AI drawing prompts (via the Zhipu AI large model), and a one-click fix for model references in workflows based on model-name matching, greatly improving the efficiency of correcting workflow model calls. The plugin requires a Baidu Translate API key for translation and a Zhipu AI API key for keyword polishing. After installation, double-click any long-text input box in ComfyUI to enable automatic translation mode and keyword polishing.