mflux

An MLX port of FLUX based on the Huggingface Diffusers implementation.

Stars: 1295

MFLUX is a line-by-line port of the FLUX implementation in the Huggingface Diffusers library to Apple MLX. It aims to run powerful FLUX models from Black Forest Labs locally on Mac machines. The codebase is minimal and explicit, prioritizing readability over generality and performance. Models are implemented from scratch in MLX, with tokenizers from the Huggingface Transformers library. Dependencies include Numpy and Pillow for image post-processing. Installation can be done using `uv tool` or classic virtual environment setup. Command-line arguments allow for image generation with specified models, prompts, and optional parameters. Quantization options for speed and memory reduction are available. LoRA adapters can be loaded for fine-tuning image generation. Controlnet support provides more control over image generation with reference images. Current limitations include generating images one by one, lack of support for negative prompts, and some LoRA adapters not working.

README:

About

Run the powerful FLUX models from Black Forest Labs locally on your Mac!

Philosophy
💿 Installation
🖼️ Generating an image
- 📜 Full list of Command-Line Arguments
⏱️ Image generation speed (updated)
↔️ Equivalent to Diffusers implementation
🗜️ Quantization
- 📊 Size comparisons for quantized models
- 💾 Saving a quantized version to disk
- 💽 Loading and running a quantized version from disk
💽 Running a non-quantized model directly from disk
🌐 Third-Party HuggingFace Model Support
🎨 Image-to-Image
🔌 LoRA
- Multi-LoRA
- Supported LoRA formats (updated)
🎭 In-Context LoRA
- Available Styles
- How It Works
- Tips for Best Results
🛠️ Flux Tools
- 🖌️ Fill
  - Inpainting
  - Outpainting
🎛️ Dreambooth fine-tuning
- Training configuration
- Training example
- Resuming a training run
- Configuration details
- Memory issues
- Misc
🕹️ Controlnet
🚧 Current limitations
💡Workflow tips
✅ TODO
🔬 Cool research / features to support
🌱‍ Related projects
License

Philosophy

MFLUX is a line-by-line port of the FLUX implementation in the Huggingface Diffusers library to Apple MLX. MFLUX is purposefully kept minimal and explicit: network architectures are hardcoded and no config files are used except for the tokenizers. The aim is to have a tiny codebase with the single purpose of expressing these models (thereby avoiding too many abstractions). While MFLUX prioritizes readability over generality and performance, it can still be quite fast, and even faster when quantized.

All models are implemented from scratch in MLX and only the tokenizers are used via the Huggingface Transformers library. Other than that, there are only minimal dependencies like Numpy and Pillow for simple image post-processing.

💿 Installation

For users, the easiest way to install MFLUX is with uv tool. If you have uv installed, simply run:

uv tool install --upgrade mflux

to get the mflux-generate and related command line executables. You can skip to the usage guides below.

For Python 3.13 dev preview

The T5 encoder depends on sentencepiece, which does not have an installable wheel artifact for Python 3.13 as of Nov 2024. Until Google publishes a 3.13 wheel, you need to build your own wheel with the official build instructions or, for your convenience, use a .whl pre-built by contributor @anthonywu. The steps below should work for most developers, though your system may vary.

uv venv --python 3.13
python -V  # e.g. Python 3.13.0rc2
source .venv/bin/activate

# for your convenience, you can use the contributor wheel
uv pip install https://github.com/anthonywu/sentencepiece/releases/download/0.2.1-py13dev/sentencepiece-0.2.1-cp313-cp313-macosx_11_0_arm64.whl

# enable the pytorch nightly 
uv pip install --pre --extra-index-url https://download.pytorch.org/whl/nightly -e .

For the classic way to create a user virtual environment:

mkdir -p mflux && cd mflux && python3 -m venv .venv && source .venv/bin/activate

This creates and activates a virtual environment in the mflux folder. After that, install MFLUX via pip:

pip install -U mflux

For contributors (click to expand)

Clone the repo:

 git clone git@github.com:filipstrand/mflux.git

Install the application

 make install

To run the test suite

 make test

Follow format and lint checks prior to submitting Pull Requests. The recommended make lint and make format install and use ruff. You can set up your editor/IDE to lint/format automatically, or use our provided make helpers:

make format - formats your code
make lint - shows your lint errors and warnings, but does not auto fix
make check - via pre-commit hooks, formats your code and attempts to auto fix lint errors
consult official ruff documentation on advanced usages

If you have trouble installing MFLUX, please see the installation related issues section.

🖼️ Generating an image

Run the command mflux-generate by specifying a prompt and the model and some optional arguments. For example, here we use a quantized version of the schnell model for 2 steps:

mflux-generate --model schnell --prompt "Luxury food photograph" --steps 2 --seed 2 -q 8

This example uses the more powerful dev model with 25 time steps:

mflux-generate --model dev --prompt "Luxury food photograph" --steps 25 --seed 2 -q 8

⚠️ If the specific model is not already downloaded on your machine, it will start the download process and fetch the model weights (~34GB in size for the Schnell or Dev model respectively). See the quantization section for running compressed versions of the model. ⚠️

By default, model files are downloaded to the .cache folder within your home directory. For example, in my setup, the path looks like this:

/Users/filipstrand/.cache/huggingface/hub/models--black-forest-labs--FLUX.1-dev

To change this default behavior, you can do so by modifying the HF_HOME environment variable. For more details on how to adjust this setting, please refer to the Hugging Face documentation.
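To locate a downloaded model on disk, the path shown above can be reconstructed from the repository ID. The following is a minimal sketch, assuming the usual Hugging Face hub layout (HF_HOME/hub/models--<org>--<name>); the helper name hf_model_cache_dir is hypothetical, and other overrides such as HF_HUB_CACHE are not handled here.

```python
import os
from pathlib import Path


def hf_model_cache_dir(repo_id, env=None):
    """Return the folder where the Hugging Face hub cache is expected to store `repo_id`.

    Hypothetical helper: it only considers HF_HOME and its default location.
    """
    env = os.environ if env is None else env
    # HF_HOME falls back to ~/.cache/huggingface when it is not set.
    hf_home = Path(env.get("HF_HOME", Path.home() / ".cache" / "huggingface"))
    # Cached model repos live under hub/ as models--<org>--<name>.
    return hf_home / "hub" / ("models--" + repo_id.replace("/", "--"))


print(hf_model_cache_dir("black-forest-labs/FLUX.1-dev"))
```

Passing a custom mapping as env makes it easy to preview where the weights would land before changing HF_HOME for real.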

🔒 FLUX.1-dev currently requires granted access to its Huggingface repo. For troubleshooting, see the issue tracker 🔒

📜 Full list of Command-Line Arguments

--prompt (required, str): Text description of the image to generate.
--model or -m (required, str): Model to use for generation. Can be one of the official models ("schnell" or "dev") or a HuggingFace repository ID for a compatible third-party model (e.g., "Freepik/flux.1-lite-8B-alpha").
--base-model (optional, str, default: None): Specifies which base architecture a third-party model is derived from ("schnell" or "dev"). Required when using third-party models from HuggingFace.
--output (optional, str, default: "image.png"): Output image filename. If --seed or --auto-seeds establishes multiple seed values, the output filename will automatically be modified to include the seed value (e.g., image_seed_42.png).
--seed (optional, repeatable int args, default: None): 1 or more seeds for random number generation. e.g. --seed 42 or --seed 123 456 789. When multiple seeds are provided, MFLUX will generate one image per seed, using the same prompt and settings. Default is a single time-based value.
--auto-seeds (optional, int, default: None): Automatically generate N random seeds for a series of image generations. For example, --auto-seeds 5 will generate 5 different images with 5 different random seeds. This is superseded by explicit --seed arguments and seed values in --config-from-metadata files.
--height (optional, int, default: 1024): Height of the output image in pixels.
--width (optional, int, default: 1024): Width of the output image in pixels.
--steps (optional, int, default: 4): Number of inference steps.
--guidance (optional, float, default: 3.5): Guidance scale (only used for "dev" model).
--path (optional, str, default: None): Path to a local model on disk.
--quantize or -q (optional, int, default: None): Quantization (choose between 3, 4, 6, or 8 bits).
--lora-paths (optional, [str], default: None): The paths to the LoRA weights.
--lora-scales (optional, [float], default: None): The scale for each respective LoRA (will default to 1.0 if not specified and only one LoRA weight is loaded.)
--metadata (optional): Exports a .json file containing the metadata for the image with the same name. (Even without this flag, the image metadata is saved and can be viewed using exiftool image.png)
--image-path (optional, str, default: None): Local path to the initial image for image-to-image generation.
--image-strength (optional, float, default: 0.4): Controls how strongly the initial image influences the output image. A value of 0.0 means no influence.
--config-from-metadata or -C (optional, str): [EXPERIMENTAL] Path to a prior file saved via --metadata, or a compatible handcrafted config file adhering to the expected args schema.
--low-ram (optional): Reduces GPU memory usage by limiting the MLX cache size and releasing text encoders and transformer components after use (single image generation only). While this may slightly decrease performance, it helps prevent system memory swapping to disk, allowing image generation on systems with limited RAM.
--lora-name (optional, str, default: None): The name of the LoRA to download from Hugging Face.
--lora-repo-id (optional, str, default: "ali-vilab/In-Context-LoRA"): The Hugging Face repository ID for LoRAs.
--stepwise-image-output-dir (optional, str, default: None): [EXPERIMENTAL] Output directory to write step-wise images and their final composite image to. This feature may change in future versions. When specified, MFLUX will save an image for each denoising step, allowing you to visualize the generation process from noise to final image.
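When several seeds are given, the --output description above says the seed is added to the filename (e.g., image_seed_42.png). The sketch below reproduces that naming pattern so downstream scripts can predict the files to expect; seeded_output_name is a hypothetical helper, and the actual naming logic inside MFLUX may differ in details.

```python
from pathlib import Path


def seeded_output_name(output, seed):
    """Insert `_seed_<seed>` before the file extension (hypothetical helper)."""
    path = Path(output)
    return str(path.with_name(f"{path.stem}_seed_{seed}{path.suffix}"))


# One expected filename per seed, mirroring `--seed 123 456 789`.
for seed in (123, 456, 789):
    print(seeded_output_name("image.png", seed))
```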

📜 In-Context LoRA Command-Line Arguments

The mflux-generate-in-context command supports most of the same arguments as mflux-generate, with these additional parameters:

--image-path (required, str): Path to the reference image that will guide the style of the generated image.
--lora-style (optional, str, default: None): The style to use for In-Context LoRA generation. Choose from: couple, storyboard, font, home, illustration, portrait, ppt, sandstorm, sparklers, or identity.

See the In-Context LoRA section for more details on how to use this feature effectively.

📜 ControlNet Command-Line Arguments

The mflux-generate-controlnet command supports most of the same arguments as mflux-generate, with these additional parameters:

--controlnet-image-path (required, str): Path to the local image used by ControlNet to guide output generation.
--controlnet-strength (optional, float, default: 0.4): Degree of influence the control image has on the output. Ranges from 0.0 (no influence) to 1.0 (full influence).
--controlnet-save-canny (optional, bool, default: False): If set, saves the Canny edge detection reference image used by ControlNet.

See the Controlnet section for more details on how to use this feature effectively.

📜 Batch Image Generation Arguments

--prompts-file (required, str): Local path for a file that holds a batch of prompts.
--global-seed (optional, int): Entropy Seed used for all prompts in the batch.

📜 Training Arguments

--train-config (optional, str): Local path of the training configuration file. This file defines all aspects of the training process including model parameters, optimizer settings, and training data. See the Training configuration section for details on the structure of this file.
--train-checkpoint (optional, str): Local path of the checkpoint file which specifies how to continue the training process. Used when resuming an interrupted training run.

Parameters supported by config files

How configs are used

all config properties are optional and applied to the image generation if applicable
invalid or incompatible properties will be ignored

Config schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "seed": {
      "type": ["integer", "null"]
    },
    "steps": {
      "type": ["integer", "null"]
    },
    "guidance": {
      "type": ["number", "null"]
    },
    "quantize": {
      "type": ["null", "string"]
    },
    "lora_paths": {
      "type": ["array", "null"],
      "items": {
        "type": "string"
      }
    },
    "lora_scales": {
      "type": ["array", "null"],
      "items": {
        "type": "number"
      }
    },
    "prompt": {
      "type": ["string", "null"]
    }
  }
}

Example

{
  "model": "dev",
  "seed": 42,
  "steps": 8,
  "guidance": 3.0,
  "quantize": 4,
  "lora_paths": [
    "/some/path1/to/subject.safetensors",
    "/some/path2/to/style.safetensors"
  ],
  "lora_scales": [
    0.8,
    0.4
  ],
  "prompt": "award winning modern art, MOMA"
}
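A handcrafted config like the one above can also be produced from a script. The following is a minimal sketch that writes a config file and assembles (but does not run) a matching mflux-generate command; the file name is hypothetical, and it is an assumption that --prompt can be left off the command line when the config supplies it.

```python
import json
import shlex
import tempfile
from pathlib import Path

# Only properties listed in the schema above are used here.
config = {
    "seed": 42,
    "steps": 8,
    "guidance": 3.0,
    "prompt": "award winning modern art, MOMA",
}

# Hypothetical file name, placed in the system temp directory.
config_path = Path(tempfile.gettempdir()) / "mflux_config.json"
config_path.write_text(json.dumps(config, indent=2))

# Assumption: the prompt comes from the config, so --prompt is omitted.
command = ["mflux-generate", "--model", "dev", "--config-from-metadata", str(config_path)]
print(shlex.join(command))
```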

Alternatively, with the correct Python environment active, you can call MFLUX directly from a separate Python script like the following:

from mflux import Flux1, Config

# Load the model
flux = Flux1.from_name(
   model_name="schnell",  # "schnell" or "dev"
   quantize=8,            # 4 or 8
)

# Generate an image
image = flux.generate_image(
   seed=2,
   prompt="Luxury food photograph",
   config=Config(
      num_inference_steps=2,  # "schnell" works well with 2-4 steps, "dev" works well with 20-25 steps
      height=1024,
      width=1024,
   )
)

image.save(path="image.png")

For more options on how to configure MFLUX, please see generate.py.

⏱️ Image generation speed (updated)

These numbers are based on the non-quantized schnell model, with the configuration provided in the code snippet below. To time your machine, run the following:

time mflux-generate \
--prompt "Luxury food photograph" \
--model schnell \
--steps 2 \
--seed 2 \
--height 1024 \
--width 1024

To find out the specs of your machine (including the number of CPU cores, GPU cores, and memory), run the following command:

system_profiler SPHardwareDataType SPDisplaysDataType

Device	M-series	User	Reported Time	Notes
Mac Studio	2023 M2 Ultra	@awni	<15s
Macbook Pro	2024 M4 Max (128GB)	@ivanfioravanti	~19s
Macbook Pro	2023 M3 Max	@karpathy	~20s
-	2023 M2 Max (96GB)	@explorigin	~25s
Mac Mini	2024 M4 Pro (64GB)	@Stoobs	~34s
Mac Mini	2023 M2 Pro (32GB)	@leekichko	~54s
-	2022 M1 MAX (64GB)	@BosseParra	~55s
Macbook Pro	2023 M2 Max (32GB)	@filipstrand	~70s
-	2023 M3 Pro (36GB)	@kush-gupt	~80s
Mac Mini	2024 M4 (16GB)	@wnma3mz	~97s	512 x 512, 8-bit quantization
Macbook Pro	2021 M1 Pro (32GB)	@filipstrand	~160s
-	2021 M1 Pro (16GB)	@qw-in	~175s	Might freeze your mac
Macbook Air	2020 M1 (8GB)	@mbvillaverde	~335s	With resolution 512 x 512

Note that these numbers include starting the application from scratch, which means doing model I/O, setting/quantizing weights, etc. If we assume that the model is already loaded, you can inspect the image metadata using exiftool image.png and see the total duration of the denoising loop (excluding text embedding).

These benchmarks are not very scientific and are only intended to give ballpark numbers. They were performed at different times with different MFLUX and MLX versions. Additional hardware information, such as the number of GPU cores or the exact Mac device, is not always known.
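To get slightly more stable numbers than a single run of time, the same command can be repeated from a small script. This is a sketch under the assumption that wall-clock time per process launch is what you want to measure; time_command is a hypothetical helper, and the placeholder command should be replaced with the mflux-generate invocation shown earlier.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run `argv` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True)  # raises if the command fails
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Placeholder command; swap in e.g.
    # ["mflux-generate", "--prompt", "Luxury food photograph", "--model", "schnell", "--steps", "2", "--seed", "2"]
    argv = [sys.executable, "-c", "pass"]
    durations = time_command(argv)
    print(f"best: {min(durations):.2f}s, mean: {sum(durations) / len(durations):.2f}s")
```

Because each run starts a fresh process, model loading is included every time, just like in the table above.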

↔️ Equivalent to Diffusers implementation

There is only a single source of randomness when generating an image: the initial latent array. In this implementation, this initial latent is fully determined by the input seed parameter. However, if we were to import a fixed instance of this latent array saved from the Diffusers implementation, then MFLUX would produce an identical image to the Diffusers implementation (assuming a fixed prompt and using the default parameter settings in the Diffusers setup).
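The role of the seed can be illustrated with a toy example. The sketch below uses Python's standard random module instead of MLX, so it does not reproduce MFLUX's actual latents; it only shows the general property that a seeded generator yields the same starting noise every time.

```python
import random


def initial_latent(seed, size=8):
    """Draw `size` standard-normal values from a generator seeded with `seed`.

    Toy stand-in for the initial latent array; not MFLUX's implementation.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_seed_matches = initial_latent(2) == initial_latent(2)
different_seed_matches = initial_latent(2) == initial_latent(3)
print(same_seed_matches, different_seed_matches)
```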

The images below illustrate this equivalence. In all cases the Schnell model was run for 2 time steps. The Diffusers implementation ran in CPU mode. The precision for MFLUX can be set in the Config class. There is typically a noticeable but very small difference in the final image when switching between 16-bit and 32-bit precision.

Luxury food photograph

detailed cinematic dof render of an old dusty detailed CRT monitor on a wooden desk in a dim room with items around, messy dirty room. On the screen are the letters "FLUX" glowing softly. High detail hard surface render

photorealistic, lotr, A tiny red dragon curled up asleep inside a nest, (Soft Focus) , (f_stop 2.8) , (focal_length 50mm) macro lens f/2. 8, medieval wizard table, (pastel) colors, (cozy) morning light filtering through a nearby window, (whimsical) steam shapes, captured with a (Canon EOS R5) , highlighting (serene) comfort, medieval, dnd, rpg, 3d, 16K, 8K

A weathered fisherman in his early 60s stands on the deck of his boat, gazing out at a stormy sea. He has a thick, salt-and-pepper beard, deep-set blue eyes, and skin tanned and creased from years of sun exposure. He's wearing a yellow raincoat and hat, with water droplets clinging to the fabric. Behind him, dark clouds loom ominously, and waves crash against the side of the boat. The overall atmosphere is one of tension and respect for the power of nature.

Luxury food photograph of an italian Linguine pasta alle vongole dish with lots of clams. It has perfect lighting and a cozy background with big bokeh and shallow depth of field. The mood is a sunset balcony in tuscany.  The photo is taken from the side of the plate. The pasta is shiny with sprinkled parmesan cheese and basil leaves on top. The scene is complemented by a warm, inviting light that highlights the textures and colors of the ingredients, giving it an appetizing and elegant look.

🗜️ Quantization

MFLUX supports running FLUX in 3, 4, 6, or 8-bit quantized mode. Running a quantized version can greatly speed up the generation process and reduce the memory consumption by several gigabytes. Quantized models also take up less disk space.

mflux-generate \
    --model schnell \
    --steps 2 \
    --seed 2 \
    --quantize 8 \
    --height 1920 \
    --width 1024 \
    --prompt "Tranquil pond in a bamboo forest at dawn, the sun is barely starting to peak over the horizon, panda practices Tai Chi near the edge of the pond, atmospheric perspective through the mist of morning dew, sunbeams, its movements are graceful and fluid — creating a sense of harmony and balance, the pond's calm waters reflecting the scene, inviting a sense of meditation and connection with nature, style of Howard Terpning and Jessica Rossier"

In this example, weights are quantized at runtime - this is convenient if you don't want to save a quantized copy of the weights to disk, but still want to benefit from the potential speedup and RAM reduction quantization might bring.

By selecting the --quantize or -q flag to be 4, 8, or removing it entirely, we get all 3 images above. As can be seen, there is very little difference between the images (especially between the 8-bit, and the non-quantized result). Image generation times in this example are based on a 2021 M1 Pro (32GB) machine. Even though the images are almost identical, there is a ~2x speedup by running the 8-bit quantized version on this particular machine. Unlike the non-quantized version, for the 8-bit version the swap memory usage is drastically reduced and GPU utilization is close to 100% during the whole generation. Results here can vary across different machines.

For systems with limited RAM, you can also use the --low-ram option which reduces GPU memory usage by constraining the MLX cache size and releasing text encoders and transformer components after use. This option is particularly helpful for preventing system memory swapping to disk on machines with less available RAM.

📊 Size comparisons for quantized models

The model sizes for both schnell and dev at various quantization levels are as follows:

3 bit	4 bit	6 bit	8 bit	Original (16 bit)
7.52GB	9.61GB	13.81GB	18.01GB	33.73GB
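The relative savings follow directly from the table. As a small worked example, the script below computes how much smaller each quantized model is compared to the original 16-bit weights, using only the sizes listed above.

```python
# Model sizes in GB, copied from the table above.
sizes_gb = {"3 bit": 7.52, "4 bit": 9.61, "6 bit": 13.81, "8 bit": 18.01, "16 bit": 33.73}

original = sizes_gb["16 bit"]
# Fraction of disk space saved relative to the original weights.
savings = {name: 1 - size / original for name, size in sizes_gb.items() if name != "16 bit"}

for name, fraction in savings.items():
    print(f"{name}: {sizes_gb[name]:.2f} GB, {fraction:.0%} smaller than the original")
```

The savings range from a little under half (8-bit) to more than three quarters (3-bit) of the original size.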

💾 Saving a quantized version to disk

To save a local copy of the quantized weights, run the mflux-save command like so:

mflux-save \
    --path "/Users/filipstrand/Desktop/schnell_8bit" \
    --model schnell \
    --quantize 8

Note that when saving a quantized version, you will need the original huggingface weights.

It is also possible to specify LoRA adapters when saving the model, e.g.:

mflux-save \
    --path "/Users/filipstrand/Desktop/schnell_8bit" \
    --model schnell \
    --quantize 8 \
    --lora-paths "/path/to/lora.safetensors" \
    --lora-scales 0.7

When generating images with a model like this, no LoRA adapter needs to be specified, since it is already baked into the saved quantized weights.

💽 Loading and running a quantized version from disk

To generate a new image from the quantized model, simply provide a --path to where it was saved:

mflux-generate \
    --path "/Users/filipstrand/Desktop/schnell_8bit" \
    --model schnell \
    --steps 2 \
    --seed 2 \
    --height 1920 \
    --width 1024 \
    --prompt "Tranquil pond in a bamboo forest at dawn, the sun is barely starting to peak over the horizon, panda practices Tai Chi near the edge of the pond, atmospheric perspective through the mist of morning dew, sunbeams, its movements are graceful and fluid — creating a sense of harmony and balance, the pond's calm waters reflecting the scene, inviting a sense of meditation and connection with nature, style of Howard Terpning and Jessica Rossier"

Note: When loading a quantized model from disk, there is no need to pass the -q flag, since the quantization level is inferred from the weight metadata.

Also note: Once we have a local model (quantized or not) specified via the --path argument, the Huggingface cache models are not required to launch the model. In other words, you can reclaim the 34GB of disk space (per model) by deleting the full 16-bit model from the Huggingface cache if you choose.

⚠️ Quantized models saved with mflux < v.0.6.0 will not work with v.0.6.0 and later due to an updated implementation. The solution is to save a new quantized local copy.

If you don't want to download the full models and quantize them yourself, the 4-bit weights are available here for a direct download:

For mflux < v.0.6.0:
- madroid/flux.1-schnell-mflux-4bit
- madroid/flux.1-dev-mflux-4bit
For mflux >= v.0.6.0:
- dhairyashil/FLUX.1-schnell-mflux-v0.6.2-4bit
- dhairyashil/FLUX.1-dev-mflux-4bit

Using the community model support, the quantized weights can also be downloaded automatically:

mflux-generate \
    --model "dhairyashil/FLUX.1-schnell-mflux-v0.6.2-4bit" \
    --base-model schnell \
    --steps 2 \
    --seed 2 \
    --prompt "Luxury food photograph"

💽 Running a non-quantized model directly from disk

MFLUX also supports running a non-quantized model directly from a custom location. In the example below, the model is placed in /Users/filipstrand/Desktop/schnell:

mflux-generate \
    --path "/Users/filipstrand/Desktop/schnell" \
    --model schnell \
    --steps 2 \
    --seed 2 \
    --prompt "Luxury food photograph"

Note that the --model flag must be set when loading a model from disk.

Also note that, unlike the typical alias-based way of initializing the model (which internally ensures that the required resources are downloaded), loading a model directly from disk requires the downloaded model folder to look like the following:

.
├── text_encoder
│   └── model.safetensors
├── text_encoder_2
│   ├── model-00001-of-00002.safetensors
│   └── model-00002-of-00002.safetensors
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── tokenizer_2
│   ├── special_tokens_map.json
│   ├── spiece.model
│   ├── tokenizer.json
│   └── tokenizer_config.json
├── transformer
│   ├── diffusion_pytorch_model-00001-of-00003.safetensors
│   ├── diffusion_pytorch_model-00002-of-00003.safetensors
│   └── diffusion_pytorch_model-00003-of-00003.safetensors
└── vae
    └── diffusion_pytorch_model.safetensors

This mirrors how the resources are placed in the HuggingFace Repo for FLUX.1. Huggingface weights, unlike quantized ones exported directly from this project, have to be processed a bit differently, which is why we require this structure above.
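Before pointing --path at a local folder, it can help to verify that the top-level layout matches the tree above. The following sketch checks only for the six expected subfolders, not for the individual weight or tokenizer files; missing_model_subdirs is a hypothetical helper and not part of MFLUX.

```python
from pathlib import Path

# Top-level folders taken from the directory tree above.
EXPECTED_SUBDIRS = ("text_encoder", "text_encoder_2", "tokenizer", "tokenizer_2", "transformer", "vae")


def missing_model_subdirs(model_path):
    """Return the expected subfolders that are absent under `model_path`."""
    root = Path(model_path)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


missing = missing_model_subdirs("/Users/filipstrand/Desktop/schnell")
print("layout looks complete" if not missing else f"missing folders: {missing}")
```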

🌐 Third-Party HuggingFace Model Support

MFLUX now supports compatible third-party models from HuggingFace that follow the FLUX architecture. This opens up the ecosystem to community-created models that may offer different capabilities, sizes, or specializations.

To use a third-party model, specify the HuggingFace repository ID with the --model parameter and indicate which base architecture (dev or schnell) it's derived from using the --base-model parameter:

mflux-generate \
    --model Freepik/flux.1-lite-8B \
    --base-model schnell \
    --steps 4 \
    --seed 42 \
    --prompt "A beautiful landscape with mountains and a lake"

Some examples of compatible third-party models include:

Freepik/flux.1-lite-8B-alpha - A lighter version of FLUX
shuttleai/shuttle-3-diffusion - Shuttle's implementation based on FLUX

The model will be automatically downloaded from HuggingFace the first time you use it, similar to the official FLUX models.

Note: Third-party models may have different performance characteristics, capabilities, or limitations compared to the official FLUX models. Always refer to the model's documentation on HuggingFace for specific usage instructions.

🎨 Image-to-Image

One way to condition the image generation is by starting from an existing image and letting MFLUX produce new variations. Use the --image-path flag to specify the reference image, and the --image-strength flag to control how much the reference image should guide the generation. For example, given the reference image below, the following command produced the first image using the Sketching LoRA:

mflux-generate \
--prompt "sketching of an Eiffel architecture, masterpiece, best quality. The site is lit by lighting professionals, creating a subtle illumination effect. Ink on paper with very fine touches with colored markers, (shadings:1.1), loose lines, Schematic, Conceptual, Abstract, Gestural. Quick sketches to explore ideas and concepts." \
--image-path "reference.png" \
--image-strength 0.3 \
--lora-paths Architectural_Sketching.safetensors \
--lora-scales 1.0 \
--model dev \
--steps 20 \
--seed 43 \
--guidance 4.0 \
--quantize 8 \
--height 1024 \
--width 1024

As with Controlnet, this technique combines well with LoRA adapters:

The examples above use the Sketching, Animation Shot, and flux-film-camera LoRAs.

🔌 LoRA

MFLUX supports loading trained LoRA adapters (actual training support is coming).

The following example uses the The_Hound LoRA from @TheLastBen:

mflux-generate --prompt "sandor clegane" --model dev --steps 20 --seed 43 -q 8 --lora-paths "sandor_clegane_single_layer.safetensors"

The following example uses the Flux_1_Dev_LoRA_Paper-Cutout-Style LoRA from @Norod78:

mflux-generate --prompt "pikachu, Paper Cutout Style" --model schnell --steps 4 --seed 43 -q 8 --lora-paths "Flux_1_Dev_LoRA_Paper-Cutout-Style.safetensors"

Note that LoRA trained weights are typically trained with a trigger word or phrase. For example, in the latter case, the sentence should include the phrase "Paper Cutout Style".

Also note that the same LoRA weights can work well with both the schnell and dev models. Refer to the original LoRA repository to see which model it was trained for.

Multi-LoRA

Multiple LoRAs can be sent in to combine the effects of the individual adapters. The following example combines both of the above LoRAs:

mflux-generate \
   --prompt "sandor clegane in a forest, Paper Cutout Style" \
   --model dev \
   --steps 20 \
   --seed 43 \
   --lora-paths sandor_clegane_single_layer.safetensors Flux_1_Dev_LoRA_Paper-Cutout-Style.safetensors \
   --lora-scales 1.0 1.0 \
   -q 8

To illustrate the difference, this image displays the four cases: both adapters fully active, the adapters partially active, and no LoRA at all. The example above also shows the usage of the --lora-scales flag.
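In the usual LoRA formulation, each adapter contributes a low-rank update B·A that is multiplied by its scale, and the updates from several adapters add up: W' = W + Σ scale_i · B_i · A_i. The sketch below illustrates that arithmetic on tiny matrices using plain Python lists. It is a generic illustration and assumes, rather than confirms, that MFLUX combines adapters this way; the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_loras(weight, adapters):
    """Return weight + sum(scale * B @ A) over (scale, B, A) adapters."""
    merged = [row[:] for row in weight]  # copy so the base weight stays untouched
    for scale, b, a in adapters:
        delta = matmul(b, a)  # low-rank update with the same shape as weight
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


weight = [[1.0, 0.0], [0.0, 1.0]]
subject = (1.0, [[1.0], [0.0]], [[0.0, 2.0]])  # rank-1 adapter at full strength
style = (0.5, [[0.0], [1.0]], [[4.0, 0.0]])    # rank-1 adapter at half strength
print(apply_loras(weight, [subject, style]))
```

Setting a scale to 0.0 removes that adapter's contribution entirely, which is the intuition behind tuning --lora-scales per adapter.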

Supported LoRA formats (updated)

Since different fine-tuning services can use different implementations of FLUX, the corresponding LoRA weights trained on these services can be different from one another. The aim of MFLUX is to support the most common ones. The following table shows the currently supported formats:

Supported	Name	Example	Notes
✅	BFL	civitai - Impressionism	Many things on civitai seem to work
✅	Diffusers	Flux_1_Dev_LoRA_Paper-Cutout-Style
❌	XLabs-AI	flux-RealismLora

To report additional formats, examples, or any other suggestions related to LoRA format support, please see issue #47.

🎭 In-Context LoRA

In-Context LoRA is a powerful technique that allows you to generate images in a specific style based on a reference image, without requiring model fine-tuning. This approach uses specialized LoRA weights that enable the model to understand and apply the visual context from your reference image to a new generation.

This feature is based on the In-Context LoRA for Diffusion Transformers project by Ali-ViLab.

To use In-Context LoRA, you need:

A reference image
A style LoRA (optional - the in-context ability works without LoRAs, but they can significantly enhance the results)

Available Styles

MFLUX provides several pre-defined styles from the Hugging Face ali-vilab/In-Context-LoRA repository that you can use with the --lora-style argument:

Style Name	Description
`couple`	Couple profile photography style
`storyboard`	Film storyboard sketching style
`font`	Font design and typography style
`home`	Home decoration and interior design style
`illustration`	Portrait illustration style
`portrait`	Portrait photography style
`ppt`	Presentation template style
`sandstorm`	Sandstorm visual effect
`sparklers`	Sparklers visual effect
`identity`	Visual identity and branding design style

How It Works

The In-Context LoRA generation creates a side-by-side image where:

The left side shows your reference image with noise applied
The right side shows the new generation that follows your prompt while maintaining the visual context

The final output is automatically cropped to show only the right half (the generated image).
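The cropping step can be expressed as a simple box calculation. The sketch below computes the right-half region of a side-by-side composite in the (left, upper, right, lower) convention used by Pillow's Image.crop; the 2048 x 1024 composite size is an assumed example, and right_half_box is a hypothetical helper rather than MFLUX's own code.

```python
def right_half_box(width, height):
    """Return a (left, upper, right, lower) box covering the right half of an image."""
    return (width // 2, 0, width, height)


# Assumed example: a composite made of two 1024 x 1024 halves.
print(right_half_box(2048, 1024))
```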

Prompting for In-Context LoRA

For best results with In-Context LoRA, your prompt should describe both the reference image and the target image you want to generate. Use markers like [IMAGE1], [LEFT], or [RIGHT] to distinguish between the two parts.

Here's an example:

mflux-generate-in-context \
  --model dev \
  --steps 20 \
  --quantize 8 \
  --seed 42 \
  --height 1024 \
  --width 1024 \
  --image-path "reference.png" \
  --lora-style identity \
  --prompt "In this set of two images, a bold modern typeface with the brand name 'DEMA' is introduced and is shown on a company merchandise product photo; [IMAGE1] a simplistic black logo featuring a modern typeface with the brand name 'DEMA' on a bright light green/yellowish background; [IMAGE2] the design is printed on a green/yellowish hoodie as a company merchandise product photo with a plain white background."

This prompt clearly describes both the reference image (after [IMAGE1]) and the desired output (after [IMAGE2]). Other marker pairs you can use include:

[LEFT] and [RIGHT]
[TOP] and [BOTTOM]
[REFERENCE] and [OUTPUT]

Important: In the current implementation, the reference image is ALWAYS placed on the left side of the composition, and the generated image on the right side. When using marker pairs in your prompt, the first marker (e.g., [IMAGE1], [LEFT], [REFERENCE]) always refers to your reference image, while the second marker (e.g., [IMAGE2], [RIGHT], [OUTPUT]) refers to what you want to generate.
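When generating many variations, the two-part prompt can be assembled programmatically. This is a minimal sketch that follows the structure of the example prompt above (overall summary, then reference description, then target description); in_context_prompt is a hypothetical helper, and the separator style is a choice, not a requirement of MFLUX.

```python
def in_context_prompt(summary, reference, target, markers=("[IMAGE1]", "[IMAGE2]")):
    """Join a summary with marker-prefixed descriptions of the reference and target images."""
    first, second = markers  # the first marker always describes the reference image
    return f"{summary}; {first} {reference}; {second} {target}"


print(
    in_context_prompt(
        "In this set of two images, a brand logo is shown on a merchandise product photo",
        "a simplistic black logo on a bright green background",
        "the design is printed on a green hoodie with a plain white background",
        markers=("[LEFT]", "[RIGHT]"),
    )
)
```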

Tips for Best Results

Choose the right reference image: Select a reference image with a clear composition and structure that matches your intended output.
Adjust guidance: Higher guidance values (7.0-9.0) tend to produce results that more closely follow your prompt.
Try different styles: Each style produces distinctly different results - experiment to find the one that best matches your vision.
Increase steps: For higher quality results, use 25-30 steps.
Detailed prompting: Be specific about both the reference image and your desired output in your prompt.
Try without LoRA: While LoRAs enhance the results, you can experiment without them to see the base in-context capabilities.

🛠️ Flux Tools

MFLUX supports the official Flux.1 Tools.

🖌️ Fill

The Fill tool uses the FLUX.1-Fill-dev model to allow you to selectively edit parts of an image by providing a binary mask. This is useful for inpainting (replacing specific areas) and outpainting (expanding the canvas).

Inpainting

Inpainting allows you to selectively regenerate specific parts of an image while preserving the rest. This is perfect for removing unwanted objects, adding new elements, or changing specific areas of an image without affecting the surrounding content. The Fill tool understands the context of the entire image and creates seamless edits that blend naturally with the preserved areas.

Original dog image credit: Julio Bernal on Unsplash

Creating Masks

Before using the Fill tool, you need an image and a corresponding mask. You can create a mask using the included tool:

python -m tools.fill_mask_tool /path/to/your/image.jpg

This will open an interactive interface where you can paint over the areas you want to regenerate. Pressing the s key will save the mask at the same location as the image.

Example

To regenerate specific parts of an image:

mflux-generate-fill \
  --prompt "A professionally shot studio photograph of a dog wearing a red hat and ski goggles. The dog is centered against a uniformly bright yellow background, with well-balanced lighting and sharp details." \
  --steps 20 \
  --seed 42 \
  --height 1280 \
  --width 851 \
  --guidance 30 \
  -q 8 \
  --image-path "dog.png" \
  --masked-image-path "dog_mask.png"

Outpainting

Original room image credit: Alexey Aladashvili on Unsplash

Outpainting extends your image beyond its original boundaries, allowing you to expand the canvas in any direction while maintaining visual consistency with the original content. This is useful for creating wider landscapes, revealing more of a scene, or transforming a portrait into a full-body image. The Fill tool intelligently generates new content that seamlessly connects with the existing image.

You can expand the canvas of your image using the provided tool:

python -m tools.create_outpaint_image_canvas_and_mask \
  /path/to/your/image.jpg \
  --image-outpaint-padding "0,30%,20%,0"

As an example, here's how to add 25% padding to both the left and right sides of an image:

python -m tools.create_outpaint_image_canvas_and_mask \
  room.png \
  --image-outpaint-padding "0,25%,0,25%"

The padding format is "top,right,bottom,left" where each value can be in pixels or as a percentage. For example, "0,30%,20%,0" expands the canvas by 30% to the right and 20% to the bottom.

After running this command, you'll get two files: an expanded canvas with your original image and a corresponding mask. You can then run the mflux-generate-fill command similar to the inpainting example, using these files as input.
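To make the "top,right,bottom,left" padding format concrete, here is a minimal sketch of how such a string could be turned into pixel amounts and an expanded canvas size. It assumes percentages for top and bottom are relative to the image height and percentages for left and right are relative to the image width; that interpretation and the helper names are assumptions, not taken from the MFLUX tool.

```python
def parse_padding(spec: str, width: int, height: int) -> tuple[int, int, int, int]:
    """Convert "top,right,bottom,left" (pixels or percentages) to pixel amounts."""
    parts = [part.strip() for part in spec.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated values")

    def to_pixels(value: str, full: int) -> int:
        # Assumption: percentages are taken relative to the matching image side.
        if value.endswith("%"):
            return round(full * float(value[:-1]) / 100)
        return int(value)

    top = to_pixels(parts[0], height)
    right = to_pixels(parts[1], width)
    bottom = to_pixels(parts[2], height)
    left = to_pixels(parts[3], width)
    return top, right, bottom, left


def expanded_size(spec: str, width: int, height: int) -> tuple[int, int]:
    """Return the (width, height) of the canvas after padding is applied."""
    top, right, bottom, left = parse_padding(spec, width, height)
    return width + left + right, height + top + bottom


# 25% on the left and right of a 1000x800 image widens it by 500 pixels.
print(expanded_size("0,25%,0,25%", 1000, 800))
```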

Example

Once you've created the expanded canvas and mask files, run the Fill tool on them:

mflux-generate-fill \
  --prompt "A detailed interior room photograph with natural lighting, extended in a way that perfectly complements the original space. The expanded areas continue the architectural style, color scheme, and overall aesthetic of the room seamlessly." \
  --steps 25 \
  --seed 43 \
  --guidance 30 \
  -q 8 \
  --image-path "room.png" \
  --masked-image-path "room_mask.png"

Tips for Best Results

Model: The model is always FLUX.1-Fill-dev and should not be specified.
Guidance: Higher guidance values (around 30) typically yield better results. This is the default if not specified.
Resolution: Higher resolution images generally produce better results with the Fill tool.
Masks: Make sure your mask clearly defines the areas you want to regenerate. White areas in the mask will be regenerated, while black areas will be preserved.
Prompting: For best results, provide detailed prompts that describe both what you want in the newly generated areas and how it should relate to the preserved parts of the image.
Steps: Using 20-30 denoising steps generally produces higher quality results.
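For simple cases, a mask can also be produced programmatically instead of painting it by hand. The sketch below writes a rectangular mask as an 8-bit grayscale PNG using only the Python standard library, with white (255) inside the region to regenerate and black (0) elsewhere. Whether the Fill tool accepts a grayscale PNG produced this way is an assumption; the function names are hypothetical.

```python
import struct
import zlib


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    body = kind + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF)


def rectangle_mask_png(width: int, height: int, box: tuple[int, int, int, int]) -> bytes:
    """Return PNG bytes: white inside box (left, top, right, bottom), black outside."""
    left, top, right, bottom = box
    rows = bytearray()
    for y in range(height):
        rows.append(0)  # PNG filter type 0 (no filtering) starts every scanline
        for x in range(width):
            inside = left <= x < right and top <= y < bottom
            rows.append(255 if inside else 0)
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(bytes(rows)))
        + _chunk(b"IEND", b"")
    )


mask_bytes = rectangle_mask_png(64, 64, (16, 16, 48, 48))
print(len(mask_bytes), "bytes of PNG data")
```

Writing `mask_bytes` to a file opened in binary mode gives a mask whose dimensions should match the image it accompanies.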

⚠️ Note: Using the Fill tool requires an additional 33.92 GB download from black-forest-labs/FLUX.1-Fill-dev. The download happens automatically on first use.

🎛️ Dreambooth fine-tuning

As of release v.0.5.0, MFLUX has support for fine-tuning your own LoRA adapters using the Dreambooth technique.

This example shows the MFLUX training progression of the included training example which is based on the DreamBooth Dataset, also used in the mlx-examples repo.

Training configuration

To describe a training run, you need to provide a training configuration file which specifies the details such as what training data to use and various parameters. To try it out, one of the easiest ways is to start from the provided example configuration and simply use your own dataset and prompts by modifying the examples section of the json file.

Training example

A complete example (training configuration + dataset) is provided in this repository. To start a training run, go to the project folder (cd mflux) and run:

mflux-train --train-config src/mflux/dreambooth/_example/train.json

By default, this will train an adapter with images of size 512x512 with a batch size of 1 and can take up to several hours to fully complete depending on your machine. If this task is too computationally demanding, see the section on memory issues for tips on how to speed things up and what tradeoffs exist.

During training, MFLUX will output training checkpoints with artifacts (weights, states) according to what is specified in the configuration file. As specified in the file train.json, these files will be placed in the folder ~/Desktop/train, but this can be changed to any other path by adjusting the configuration. All training artifacts are saved as a self-contained zip file, which can later be used to resume an existing training run. To find the LoRA weights, unzip the file, look for the adapter safetensors file, and use it as you would a regular downloaded LoRA adapter.
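Locating the adapter file inside a checkpoint can also be scripted. The following sketch lists every `.safetensors` entry in a zip archive using the standard `zipfile` module. The internal file names used in the demo are made up for illustration; the real layout of an MFLUX checkpoint may differ.

```python
import io
import zipfile


def list_adapter_files(checkpoint_zip) -> list[str]:
    """Return the names of all .safetensors entries in a checkpoint archive."""
    with zipfile.ZipFile(checkpoint_zip) as archive:
        return [name for name in archive.namelist() if name.endswith(".safetensors")]


# Demo on an in-memory archive with hypothetical entry names.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("checkpoint/adapter.safetensors", b"placeholder")
    archive.writestr("checkpoint/config.json", "{}")
buffer.seek(0)

found = list_adapter_files(buffer)
print(found)
```

The same function accepts a path on disk, for example a checkpoint zip in the output folder from the training configuration.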

Resuming a training run

The training process will continue to run until each training example has been used num_epochs times. For various reasons however, the user might choose to interrupt the process. To resume training for a given checkpoint, say 0001000_checkpoint.zip, simply run:

mflux-train --train-checkpoint 0001000_checkpoint.zip

This uses the --train-checkpoint command-line argument to specify the checkpoint file to resume from.

There are two nice properties of the training procedure:

Fully deterministic (given a specified seed in the training configuration)
The complete training state (including optimizer state) is saved at each checkpoint.

Because of these, MFLUX has the ability to resume a training run from a previous checkpoint and have the results be exactly identical to a training run which was never interrupted in the first place.

⚠️ Note: Everything but the dataset itself is contained within this zip file, as the dataset can be quite large. The zip file will contain configuration files which point to the original dataset, so make sure that it is in the same place when resuming training.

⚠️ Note: One current limitation is that a training run can only be resumed if it has not yet been completed. In other words, only checkpoints that represent an interrupted training run can be resumed and run until completion.

Configuration details

Currently, MFLUX supports fine-tuning only for the transformer part of the model. In the training configuration, under lora_layers, you can specify which layers you want to train. The available ones are:

transformer_blocks:
- attn.to_q
- attn.to_k
- attn.to_v
- attn.add_q_proj
- attn.add_k_proj
- attn.add_v_proj
- attn.to_out
- attn.to_add_out
- ff.linear1
- ff.linear2
- ff_context.linear1
- ff_context.linear2
single_transformer_blocks:
- proj_out
- proj_mlp
- attn.to_q
- attn.to_k
- attn.to_v

The block_range under the respective layer category specifies which blocks to train. The maximum ranges available for the different layer categories are:

transformer_blocks:
- start: 0
- end: 19
single_transformer_blocks:
- start: 0
- end: 38

Specify individual layers

For even more precision, you can specify individual block indices to train like so:

"lora_layers": {
  "single_transformer_blocks": {
    "block_range": {
      "indices": [
        0,
        1,
        7,
        19,
        20
      ],
      ...
  },
...

⚠️ Note: As the joint transformer blocks (transformer_blocks) are placed earlier in the sequence of computations, they require more resources to train. In other words, training only later layers, such as the single_transformer_blocks, should be faster. However, training too few layers, or only later layers, might result in a faster but unsuccessful training run.
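A small sketch can help catch out-of-range block indices before a long training run starts. It assumes that the listed end values (19 and 38) are exclusive, so valid indices run from 0 to end - 1, and that layer_types sits next to block_range inside each category; both points are assumptions about the schema, and `block_range` here is a hypothetical helper.

```python
import json

# Number of blocks per category, taken from the maximum ranges listed above.
MAX_BLOCKS = {"transformer_blocks": 19, "single_transformer_blocks": 38}


def block_range(category: str, indices: list[int]) -> dict:
    """Validate block indices for a category and return a block_range entry."""
    limit = MAX_BLOCKS[category]
    bad = [i for i in indices if not 0 <= i < limit]
    if bad:
        raise ValueError(f"{category}: indices {bad} are outside 0..{limit - 1}")
    return {"indices": sorted(set(indices))}


config_fragment = {
    "lora_layers": {
        "single_transformer_blocks": {
            "block_range": block_range("single_transformer_blocks", [0, 1, 7, 19, 20]),
            "layer_types": ["proj_out", "proj_mlp"],  # assumed placement of this key
        }
    }
}
print(json.dumps(config_fragment, indent=2))
```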

Under the examples section, there is an argument called "path" which specifies where the images are located. This path is relative to the config file itself.
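Resolving a path relative to the config file, rather than relative to the current working directory, can be sketched as follows. The function name and the example paths are hypothetical.

```python
from pathlib import Path


def resolve_examples_path(config_file: str, relative_path: str) -> Path:
    """Join a relative dataset path onto the folder that contains the config file."""
    return Path(config_file).parent / relative_path


# The images folder is looked up next to train.json, wherever the command is run from.
print(resolve_examples_path("/home/me/run/train.json", "images/"))
```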

Memory issues

Depending on the configuration of the training setup, fine-tuning can be quite memory intensive. In the worst case, if your Mac runs out of memory it might freeze completely and crash!

To avoid this, consider some of the following strategies to reduce memory requirements by adjusting the parameters in the training configuration:

Use a quantized base model by setting "quantize": 4 or "quantize": 8
For the layer_types, consider skipping some of the trainable layers (e.g. by not including proj_out etc.)
Use a lower rank value for the LoRA matrices.
Don't train all the 38 layers from single_transformer_blocks or all of the 19 layers from transformer_blocks
Use a smaller batch size, for example "batch_size": 1
Make sure your Mac is not busy with other background tasks that hold memory.

Applying some of these strategies, like how train.json is set up by default, will allow a 32GB M1 Pro to perform a successful fine-tuning run. Note, however, that reducing the trainable parameters might lead to worse performance.
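The effect of rank and block count on the number of trainable parameters can be estimated with simple arithmetic: a LoRA adapter of rank r on a linear layer with d_in inputs and d_out outputs adds r × (d_in + d_out) parameters. The sketch below applies that formula. The 3072 × 3072 layer shape is an illustrative assumption rather than a value taken from this document, and the estimate ignores activations and optimizer state.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def total_lora_params(num_blocks: int, layers_per_block: int, rank: int,
                      d_in: int = 3072, d_out: int = 3072) -> int:
    """Total adapter parameters when every chosen layer has the same shape."""
    return num_blocks * layers_per_block * lora_params(d_in, d_out, rank)


# Compare a larger setup with a reduced one (hypothetical configurations).
large = total_lora_params(num_blocks=38, layers_per_block=3, rank=16)
small = total_lora_params(num_blocks=10, layers_per_block=3, rank=4)
print(large, small, large / small)
```

Halving the rank halves the adapter size, and so does halving the number of trained blocks, which is why the two strategies combine well.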

Additional techniques, such as gradient checkpointing, might be implemented in the future.

Misc

This feature is currently v1 and can be considered a bit experimental. Interfaces (configuration file setup etc.) might change. The aim is also to gradually expand the scope of this feature with alternative techniques, data augmentation etc.

As with loading external LoRA adapters, the MFLUX training currently only supports training the transformer part of the network.
Sometimes, a model trained with the dev model might actually work better when applied to the schnell weights.
Currently, all training images are assumed to be in the resolution specified in the configuration file.
The loss curve can be a bit misleading or hard to read: sometimes it conveys little improvement over time, while actual image samples show the real progress.
When plotting the loss during training, we label it as "validation loss" but it is actually only the first 10 elements of the training examples for now. Future updates should support user inputs of separate validation images.
Training also works with the original model as quantized!
For the curious, a motivation for the loss function can be found here.
Two great resources that heavily inspired this feature are:
- The fine-tuning script in mlx-examples
- The original fine-tuning script in Diffusers

🕹️ Controlnet

MFLUX has Controlnet support for an even more fine-grained control of the image generation. By providing a reference image via --controlnet-image-path and a strength parameter via --controlnet-strength, you can guide the generation toward the reference image.

mflux-generate-controlnet \
  --prompt "A comic strip with a joker in a purple suit" \
  --model dev \
  --steps 20 \
  --seed 1727047657 \
  --height 1066 \
  --width 692 \
  -q 8 \
  --lora-paths "Dark Comic - s0_8 g4.safetensors" \
  --controlnet-image-path "reference.png" \
  --controlnet-strength 0.5 \
  --controlnet-save-canny

This example combines the controlnet reference image with the LoRA Dark Comic Flux.

⚠️ Note: Controlnet requires an additional one-time download of ~3.58GB of weights from Huggingface. This happens automatically the first time you run the generate-controlnet command. At the moment, the Controlnet used is InstantX/FLUX.1-dev-Controlnet-Canny, which was trained for the dev model. It can work well with schnell, but performance is not guaranteed.

⚠️ Note: The output can be highly sensitive to the controlnet strength and is very much dependent on the reference image. Settings that are too high will corrupt the image. A recommended starting point is a value like 0.4, adjusted from there by experimenting with the strength.
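Because the result depends so strongly on the strength, it can help to try a few values around the suggested starting point. The sketch below only builds the command lines for such a sweep, using the flags shown in the example above; it does not run them. The helper name is hypothetical, and output file naming is left to whatever your MFLUX version does by default.

```python
import shlex


def controlnet_command(prompt: str, image_path: str, strength: float, seed: int = 42) -> list[str]:
    """Build an mflux-generate-controlnet argument list for one strength value."""
    return [
        "mflux-generate-controlnet",
        "--prompt", prompt,
        "--model", "dev",
        "--steps", "20",
        "--seed", str(seed),
        "-q", "8",
        "--controlnet-image-path", image_path,
        "--controlnet-strength", f"{strength:.1f}",
    ]


# Keep the seed fixed so that only the strength changes between runs.
commands = [
    controlnet_command("A comic strip with a joker in a purple suit", "reference.png", s)
    for s in (0.2, 0.4, 0.6)
]
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run` once the printed commands look right.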

Controlnet can also work well together with LoRA adapters. In the example below the same reference image is used as a controlnet input with different prompts and LoRA adapters active.

🚧 Current limitations

Images are generated one by one.
Negative prompts not supported.
LoRA weights are only supported for the transformer part of the network.
Some LoRA adapters do not work.
Currently, the supported controlnet is the canny-only version.
Dreambooth training currently does not support sending in training parameters as flags.
In-Context LoRA currently only supports a left-right image setup (reference image on left, generated image on right).

Optional Tool: Batch Image Renamer

With a large number of generated images, some users want to automatically rename their image outputs to reflect the prompts and configs.

The bundled tools/rename_images.py is an optional tool that is part of the project repo but is not included in the mflux Python package, because its additional dependencies do not make sense as standard project requirements.

The script uses KeyBERT (a keyword extraction library) to extract keywords from mflux exif metadata to update the image file names. We then use uv run to execute the script in an isolated env without affecting your mflux env.

Users who want to use or extend this tool for their own needs are encouraged to git clone the repo and then uv run tools/rename_images.py <paths>, or to download the single-file standalone script and uv run your/path/rename_images.py.

This script's renaming logic can be customized to your needs. See uv run tools/rename_images.py --help for full CLI usage help.

💡Workflow Tips

To hide the model fetching status progress bars, export HF_HUB_DISABLE_PROGRESS_BARS=1
Use config files to save complex job parameters in a file instead of passing many --args
Set up shell aliases for required args. Examples:
- shortcut for dev model: alias mflux-dev='mflux-generate --model dev'
- shortcut for schnell model and always save metadata: alias mflux-schnell='mflux-generate --model schnell --metadata'
For systems with limited memory, use the --low-ram flag to reduce memory usage by constraining the MLX cache size and releasing components after use
When generating multiple images with different seeds, use --seed with multiple values or --auto-seeds to automatically generate a series of random seeds
Use --stepwise-image-output-dir to save intermediate images at each denoising step, which can be useful for debugging or creating animations of the generation process

✅ TODO

[ ] FLUX.1 Tools

🔬 Cool research / features to support

[ ] ConceptAttention
[ ] PuLID
[ ] depth based controlnet via ml-depth-pro or similar?
[ ] RF-Inversion
[ ] catvton-flux

🌱‍ Related projects

License

This project is licensed under the MIT License.

Alternative AI tools for mflux

Similar Open Source Tools

mflux

github

: 1.3k

shellChatGPT

ShellChatGPT is a shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS, featuring integration with LocalAI, Ollama, Gemini, Mistral, Groq, and GitHub Models. It provides text and chat completions, vision, reasoning, and audio models, voice-in and voice-out chatting mode, text editor interface, markdown rendering support, session management, instruction prompt manager, integration with various service providers, command line completion, file picker dialogs, color scheme personalization, stdin and text file input support, and compatibility with Linux, FreeBSD, MacOS, and Termux for a responsive experience.

github

: 71

mergekit

Mergekit is a toolkit for merging pre-trained language models. It uses an out-of-core approach to perform unreasonably elaborate merges in resource-constrained situations. Merges can be run entirely on CPU or accelerated with as little as 8 GB of VRAM. Many merging algorithms are supported, with more coming as they catch my attention.

github

: 5.5k

generative-models

Generative Models by Stability AI is a repository that provides various generative models for research purposes. It includes models like Stable Video 4D (SV4D) for video synthesis, Stable Video 3D (SV3D) for multi-view synthesis, SDXL-Turbo for text-to-image generation, and more. The repository focuses on modularity and implements a config-driven approach for building and combining submodules. It supports training with PyTorch Lightning and offers inference demos for different models. Users can access pre-trained models like SDXL-base-1.0 and SDXL-refiner-1.0 under a CreativeML Open RAIL++-M license. The codebase also includes tools for invisible watermark detection in generated images.

github

: 23.6k

LeanCopilot

Lean Copilot is a tool that enables the use of large language models (LLMs) in Lean for proof automation. It provides features such as suggesting tactics/premises, searching for proofs, and running inference of LLMs. Users can utilize built-in models from LeanDojo or bring their own models to run locally or on the cloud. The tool supports platforms like Linux, macOS, and Windows WSL, with optional CUDA and cuDNN for GPU acceleration. Advanced users can customize behavior using Tactic APIs and Model APIs. Lean Copilot also allows users to bring their own models through ExternalGenerator or ExternalEncoder. The tool comes with caveats such as occasional crashes and issues with premise selection and proof search. Users can get in touch through GitHub Discussions for questions, bug reports, feature requests, and suggestions. The tool is designed to enhance theorem proving in Lean using LLMs.

github

: 1.0k

rtdl-num-embeddings

This repository provides the official implementation of the paper 'On Embeddings for Numerical Features in Tabular Deep Learning'. It focuses on transforming scalar continuous features into vectors before integrating them into the main backbone of tabular neural networks, showcasing improved performance. The embeddings for continuous features are shown to enhance the performance of tabular DL models and are applicable to various conventional backbones, offering efficiency comparable to Transformer-based models. The repository includes Python packages for practical usage, exploration of metrics and hyperparameters, and reproducing reported results for different algorithms and datasets.

github

: 287

web-llm

WebLLM is a modular and customizable javascript package that directly brings language model chats directly onto web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU. WebLLM is fully compatible with OpenAI API. That is, you can use the same OpenAI API on any open source models locally, with functionalities including json-mode, function-calling, streaming, etc. We can bring a lot of fun opportunities to build AI assistants for everyone and enable privacy while enjoying GPU acceleration.

github

: 16.4k

py-vectara-agentic

The `vectara-agentic` Python library is designed for developing powerful AI assistants using Vectara and Agentic-RAG. It supports various agent types, includes pre-built tools for domains like finance and legal, and enables easy creation of custom AI assistants and agents. The library provides tools for summarizing text, rephrasing text, legal tasks like summarizing legal text and critiquing as a judge, financial tasks like analyzing balance sheets and income statements, and database tools for inspecting and querying databases. It also supports observability via LlamaIndex and Arize Phoenix integration.

github

: 98

ell

ell is a lightweight, functional prompt engineering framework that treats prompts as programs rather than strings. It provides tools for prompt versioning, monitoring, and visualization, as well as support for multimodal inputs and outputs. The framework aims to simplify the process of prompt engineering for language models.

github

: 4.9k

storm

STORM is a LLM system that writes Wikipedia-like articles from scratch based on Internet search. While the system cannot produce publication-ready articles that often require a significant number of edits, experienced Wikipedia editors have found it helpful in their pre-writing stage. **Try out our [live research preview](https://storm.genie.stanford.edu/) to see how STORM can help your knowledge exploration journey and please provide feedback to help us improve the system 🙏!**

github

: 17.0k

garak

Garak is a free tool that checks if a Large Language Model (LLM) can be made to fail in a way that is undesirable. It probes for hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and many other weaknesses. Garak's a free tool. We love developing it and are always interested in adding functionality to support applications.

github

: 1.3k

WindowsAgentArena

Windows Agent Arena (WAA) is a scalable Windows AI agent platform designed for testing and benchmarking multi-modal, desktop AI agents. It provides researchers and developers with a reproducible and realistic Windows OS environment for AI research, enabling testing of agentic AI workflows across various tasks. WAA supports deploying agents at scale using Azure ML cloud infrastructure, allowing parallel running of multiple agents and delivering quick benchmark results for hundreds of tasks in minutes.

github

: 147

HuggingFaceGuidedTourForMac

HuggingFaceGuidedTourForMac is a guided tour on how to install optimized pytorch and optionally Apple's new MLX, JAX, and TensorFlow on Apple Silicon Macs. The repository provides steps to install homebrew, pytorch with MPS support, MLX, JAX, TensorFlow, and Jupyter lab. It also includes instructions on running large language models using HuggingFace transformers. The repository aims to help users set up their Macs for deep learning experiments with optimized performance.

github

: 79

allms

allms is a versatile and powerful library designed to streamline the process of querying Large Language Models (LLMs). Developed by Allegro engineers, it simplifies working with LLM applications by providing a user-friendly interface, asynchronous querying, automatic retrying mechanism, error handling, and output parsing. It supports various LLM families hosted on different platforms like OpenAI, Google, Azure, and GCP. The library offers features for configuring endpoint credentials, batch querying with symbolic variables, and forcing structured output format. It also provides documentation, quickstart guides, and instructions for local development, testing, updating documentation, and making new releases.

github

: 82

kvpress

This repository implements multiple key-value cache pruning methods and benchmarks using transformers, aiming to simplify the development of new methods for researchers and developers in the field of long-context language models. It provides a set of 'presses' that compress the cache during the pre-filling phase, with each press having a compression ratio attribute. The repository includes various training-free presses, special presses, and supports KV cache quantization. Users can contribute new presses and evaluate the performance of different presses on long-context datasets.

github

: 600

datadreamer

DataDreamer is an advanced toolkit designed to facilitate the development of edge AI models by enabling synthetic data generation, knowledge extraction from pre-trained models, and creation of efficient and potent models. It eliminates the need for extensive datasets by generating synthetic datasets, leverages latent knowledge from pre-trained models, and focuses on creating compact models suitable for integration into any device and performance for specialized tasks. The toolkit offers features like prompt generation, image generation, dataset annotation, and tools for training small-scale neural networks for edge deployment. It provides hardware requirements, usage instructions, available models, and limitations to consider while using the library.

github

: 77

For similar tasks

mflux

github

: 1.3k

mindsdb

MindsDB is a platform for customizing AI from enterprise data. You can create, serve, and fine-tune models in real-time from your database, vector store, and application data. MindsDB "enhances" SQL syntax with AI capabilities to make it accessible for developers worldwide. With MindsDB’s nearly 200 integrations, any developer can create AI customized for their purpose, faster and more securely. Their AI systems will constantly improve themselves — using companies’ own data, in real-time.

github

: 36.1k

training-operator

Kubeflow Training Operator is a Kubernetes-native project for fine-tuning and scalable distributed training of machine learning (ML) models created with various ML frameworks such as PyTorch, Tensorflow, XGBoost, MPI, Paddle and others. Training Operator allows you to use Kubernetes workloads to effectively train your large models via Kubernetes Custom Resources APIs or using Training Operator Python SDK. > Note: Before v1.2 release, Kubeflow Training Operator only supports TFJob on Kubernetes. * For a complete reference of the custom resource definitions, please refer to the API Definition. * TensorFlow API Definition * PyTorch API Definition * Apache MXNet API Definition * XGBoost API Definition * MPI API Definition * PaddlePaddle API Definition * For details of all-in-one operator design, please refer to the All-in-one Kubeflow Training Operator * For details on its observability, please refer to the monitoring design doc.

github

: 1.7k

helix

HelixML is a private GenAI platform that allows users to deploy the best of open AI in their own data center or VPC while retaining complete data security and control. It includes support for fine-tuning models with drag-and-drop functionality. HelixML brings the best of open source AI to businesses in an ergonomic and scalable way, optimizing the tradeoff between GPU memory and latency.

github

: 519

nntrainer

NNtrainer is a software framework for training neural network models on devices with limited resources. It enables on-device fine-tuning of neural networks using user data for personalization. NNtrainer supports various machine learning algorithms and provides examples for tasks such as few-shot learning, ResNet, VGG, and product rating. It is optimized for embedded devices and utilizes CBLAS and CUBLAS for accelerated calculations. NNtrainer is open source and released under the Apache License version 2.0.

github

: 135

petals

Petals is a tool that allows users to run large language models at home in a BitTorrent-style manner. It enables fine-tuning and inference up to 10x faster than offloading. Users can generate text with distributed models like Llama 2, Falcon, and BLOOM, and fine-tune them for specific tasks directly from their desktop computer or Google Colab. Petals is a community-run system that relies on people sharing their GPUs to increase its capacity and offer a distributed network for hosting model layers.

github

: 9.1k

LLaVA-pp

This repository, LLaVA++, extends the visual capabilities of the LLaVA 1.5 model by incorporating the latest LLMs, Phi-3 Mini Instruct 3.8B, and LLaMA-3 Instruct 8B. It provides various models for instruction-following LMMS and academic-task-oriented datasets, along with training scripts for Phi-3-V and LLaMA-3-V. The repository also includes installation instructions and acknowledgments to related open-source contributions.

github

: 499

KULLM

KULLM (구름) is a Korean Large Language Model developed by Korea University NLP & AI Lab and HIAI Research Institute. It is based on the upstage/SOLAR-10.7B-v1.0 model and has been fine-tuned for instruction. The model has been trained on 8×A100 GPUs and is capable of generating responses in Korean language. KULLM exhibits hallucination and repetition phenomena due to its decoding strategy. Users should be cautious as the model may produce inaccurate or harmful results. Performance may vary in benchmarks without a fixed system prompt.

github

: 527

For similar jobs

weave

Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.

github

: 980

LLMStack

LLMStack is a no-code platform for building generative AI agents, workflows, and chatbots. It allows users to connect their own data, internal tools, and GPT-powered models without any coding experience. LLMStack can be deployed to the cloud or on-premise and can be accessed via HTTP API or triggered from Slack or Discord.

github

: 1.5k

VisionCraft

The VisionCraft API is a free API for using over 100 different AI models. From images to sound.

github

: 94

kaito

Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.

github

: 405

PyRIT

PyRIT is an open access automation framework designed to empower security professionals and ML engineers to red team foundation models and their applications. It automates AI Red Teaming tasks to allow operators to focus on more complicated and time-consuming tasks and can also identify security harms such as misuse (e.g., malware generation, jailbreaking), and privacy harms (e.g., identity theft). The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model. This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements.

github

: 2.9k

tabby

Tabby is a self-hosted AI coding assistant, offering an open-source and on-premises alternative to GitHub Copilot. It boasts several key features: * Self-contained, with no need for a DBMS or cloud service. * OpenAPI interface, easy to integrate with existing infrastructure (e.g Cloud IDE). * Supports consumer-grade GPUs.

github

: 32.1k

spear

SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.

GitHub stars: 224

Magick

Magick is a groundbreaking visual AIDE (Artificial Intelligence Development Environment) for no-code data pipelines and multimodal agents. Magick can connect to other services and comes with nodes and templates well-suited for intelligent agents, chatbots, complex reasoning systems and realistic characters.

GitHub stars: 675


Alternative AI tools for mflux

Similar Open Source Tools

mflux

shellChatGPT

mergekit

generative-models

LeanCopilot

rtdl-num-embeddings

web-llm

py-vectara-agentic

ell

storm

garak

WindowsAgentArena

HuggingFaceGuidedTourForMac

allms

kvpress

datadreamer

For similar tasks

mflux

mindsdb

training-operator

helix