Haoxuan Wang, Jinlong Peng, Qingdong He, Hao Yang, Ying Jin, Jiafu Wu, Xiaobin Hu, Yanjie Pan, Zhenye Gan, Mingmin Chi, Bo Peng, Yabiao Wang
Fantastic results of our proposed UniCombine on multi-conditional controllable generation:
- (a) Subject-Insertion task.
- (b) and (c) Subject-Spatial task.
- (d) Multi-Spatial task.
Our unified framework effectively handles any combination of input conditions and achieves remarkable alignment with all of them, including but not limited to text prompts, spatial maps, and subject images.
- ✅ March 12, 2025. We release the SubjectSpatial200K dataset.
- ✅ March 12, 2025. We release the UniCombine framework.
conda create -n unicombine python=3.12
conda activate unicombine
pip install -r requirements.txt
Due to an issue in the diffusers library, you need to update its site-packages code manually.
You can find the location of your diffusers library by running the following command.
pip show diffusers
Then add the following entry to the `_SET_ADAPTER_SCALE_FN_MAPPING` dictionary located in `diffusers/loaders/peft.py`:
"UniCombineTransformer2DModel": lambda model_cls, weights: weights
Place all the model weights in the `ckpt` directory. Of course, it's also acceptable to store them in other directories.
- FLUX.1-schnell
huggingface-cli download black-forest-labs/FLUX.1-schnell --local-dir ./ckpt/FLUX.1-schnell
- Condition-LoRA
huggingface-cli download Xuan-World/UniCombine --include "Condition_LoRA/*" --local-dir ./ckpt
- Denoising-LoRA
huggingface-cli download Xuan-World/UniCombine --include "Denoising_LoRA/*" --local-dir ./ckpt
- FLUX.1-schnell-training-assistant-LoRA (optional)
Download it if you want to train your own LoRA on FLUX.1-schnell.
huggingface-cli download ostris/FLUX.1-schnell-training-adapter --local-dir ./ckpt/FLUX.1-schnell-training-adapter
Schnell is a step-distilled model, meaning it can generate an image in just a few steps. However, this makes it impossible to train on directly, because every training step degrades the step distillation further. With this adapter enabled during training, that doesn't happen. It is activated during the training process and disabled during sampling. After the LoRA is trained, this adapter is no longer needed.
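A conceptual sketch of that enable/disable pattern, using diffusers' standard LoRA adapter API. The adapter name and the shape of the training loop are assumptions for illustration, not UniCombine's actual training code:

```python
import torch
from diffusers import FluxPipeline

# Load the base model and attach the training-assistant adapter.
pipe = FluxPipeline.from_pretrained("./ckpt/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe.load_lora_weights(
    "./ckpt/FLUX.1-schnell-training-adapter",
    adapter_name="training_assistant",  # hypothetical adapter name
)

pipe.set_adapters(["training_assistant"])  # keep it active during training steps
# ... run your LoRA training loop here ...

pipe.disable_lora()  # deactivate it before sampling; it is not needed afterwards
```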
- We provide the `inference.py` script as the simplest and fastest way to run our model.
- Change the `--version` argument from `training-based` to `training-free`; then you don't need to provide the Denoising-LoRA module.
- Adjust the scale of `--denoising_lora_weight` to strike a balance between editability and consistency when using Custom Prompts.
- News! We now provide a Gradio app. Run `app.py` and try it out!
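To launch the Gradio app locally:

python app.py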
Default Prompts:
python inference.py \
--condition_types fill subject \
--denoising_lora_name subject_fill_union \
--denoising_lora_weight 1.0 \
--fill examples/window/background.jpg \
--subject examples/window/subject.jpg \
--json "examples/window/1634_rank0_A decorative fabric topper for windows..json" \
--version training-based
Default Prompts:
python inference.py \
--condition_types canny subject \
--denoising_lora_name subject_canny_union \
--denoising_lora_weight 1.0 \
--canny examples/doll/canny.jpg \
--subject examples/doll/subject.jpg \
--json "examples/doll/1116_rank0_A spooky themed gothic doll..json" \
--version training-based
Custom Prompts:
python inference.py \
--condition_types canny subject \
--denoising_lora_name subject_canny_union \
--denoising_lora_weight 0.6 \
--canny examples/doll/canny.jpg \
--subject examples/doll/subject.jpg \
--json "examples/doll/1116_rank0_A spooky themed gothic doll..json" \
--version training-based \
--prompt "She stands amidst the vibrant glow of a bustling Chinatown alley, \
her pink hair shimmering under festive lantern light, clad in a sleek black dress adorned with intricate lace patterns. "
Default Prompts:
python inference.py \
--condition_types depth subject \
--denoising_lora_name subject_depth_union \
--denoising_lora_weight 1.0 \
--depth examples/car/depth.jpg \
--subject examples/car/subject.jpg \
--json "examples/car/2532_rank0_A sturdy ATV with rugged looks..json" \
--version training-based
Custom Prompts:
python inference.py \
--condition_types depth subject \
--denoising_lora_name subject_depth_union \
--denoising_lora_weight 0.6 \
--depth examples/car/depth.jpg \
--subject examples/car/subject.jpg \
--json "examples/car/2532_rank0_A sturdy ATV with rugged looks..json" \
--version training-based \
--prompt "It is positioned on a snow-covered path in a forest, its green body dusted with frost and black tires caked with packed snow. \
The vehicle retains its sturdy build with handlebars glinting ice particles and headlights cutting through falling snowflakes, surrounded by tall pine trees draped in white."
Default Prompts:
python inference.py \
--condition_types depth canny \
--denoising_lora_name depth_canny_union \
--denoising_lora_weight 1.0 \
--depth examples/toy/depth.jpg \
--canny examples/toy/canny.jpg \
--json "examples/toy/1616_rank0_A soft, plush toy with cuddly features..json" \
--version training-based
Custom Prompts:
python inference.py \
--condition_types depth canny \
--denoising_lora_name depth_canny_union \
--denoising_lora_weight 0.6 \
--depth examples/toy/depth.jpg \
--canny examples/toy/canny.jpg \
--json "examples/toy/1616_rank0_A soft, plush toy with cuddly features..json" \
--version training-based \
--prompt "It sits on a moonlit sandy beach, a small sandcastle partially washed by gentle tides beside it, \
under a night sky where the full moon casts silvery trails across waves, with distant seagulls gliding through star-dappled darkness."
- Download SubjectSpatial200K
Place our SubjectSpatial200K dataset in the `dataset` directory. Of course, it's also acceptable to store it in other directories.
huggingface-cli download Xuan-World/SubjectSpatial200K --repo-type dataset --local-dir ./dataset
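To take a quick look at the raw data before partitioning, you can load the parquet shards directly with the datasets library. This is only an inspection sketch; the glob pattern and the record fields depend on your local layout:

```python
from datasets import load_dataset

# Assumed layout: parquet shards under data_labeled (adjust the glob if needed).
ds = load_dataset(
    "parquet",
    data_files="dataset/SubjectSpatial200K/data_labeled/*.parquet",
    split="train",
)
print(ds)            # number of rows and column names
print(ds[0].keys())  # inspect one record's fields
```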
- Filter and Partition the SubjectSpatial200K dataset into training and testing sets.
The default partition scheme is identical to our paper. You can customize it as you wish.
python src/partition_dataset.py \
--dataset dataset/SubjectSpatial200K/data_labeled \
--output_dir dataset/split_SubjectSpatial200K \
--partition train
python src/partition_dataset.py \
--dataset dataset/SubjectSpatial200K/Collection3/data_labeled \
--output_dir dataset/split_SubjectSpatial200K/Collection3 \
--partition train
python src/partition_dataset.py \
--dataset dataset/SubjectSpatial200K/data_labeled \
--output_dir dataset/split_SubjectSpatial200K \
--partition test
python src/partition_dataset.py \
--dataset dataset/SubjectSpatial200K/Collection3/data_labeled \
--output_dir dataset/split_SubjectSpatial200K/Collection3 \
--partition test
You can train custom Condition-LoRA models using the script in `demo_Condition_Lora`. This simple process lets you develop new Condition-LoRA modules that extend UniCombine's capabilities.
Here we provide a walkthrough for training a Style-LoRA. It won't be time-consuming, because the training dataset contains only about 10K images.
- Download the style transfer dataset proposed by StyleBooth.
https://modelscope.cn/models/iic/stylebooth/files
It is worth noting that we fixed the format errors of `train.csv` in the StyleBooth dataset and provide the fixed version at `demo_Condition_Lora/annotation.csv`.
- Write a custom Dataset class like `demo_Condition_Lora/stylebooth_loader.py` (see the sketch after this list).
- Run `demo_Condition_Lora/train_cond_lora.py`.
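A minimal sketch of such a Dataset class, assuming each row of `demo_Condition_Lora/annotation.csv` pairs a style (condition) image with a target image and a caption. The column names below are illustrative; check `stylebooth_loader.py` and `annotation.csv` for the real ones:

```python
import csv
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset


class StyleBoothDataset(Dataset):
    """Yields (condition image, target image, caption) triples for Style-LoRA training."""

    def __init__(self, root: str, annotation: str, transform=None):
        self.root = Path(root)
        self.transform = transform
        with open(annotation, newline="") as f:
            self.rows = list(csv.DictReader(f))

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        # "style_image", "target_image", and "caption" are assumed column names.
        style = Image.open(self.root / row["style_image"]).convert("RGB")
        target = Image.open(self.root / row["target_image"]).convert("RGB")
        if self.transform is not None:
            style, target = self.transform(style), self.transform(target)
        return {"condition": style, "target": target, "caption": row["caption"]}
```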
Use our SubjectSpatial200K dataset or your customized multi-conditional dataset to train your Denoising-LoRA module.
- Configure Accelerate Environment
accelerate config
- Launch Distributed Training
accelerate launch train.py
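If you prefer not to rely on the saved config, accelerate also lets you override common settings on the command line. For example, for 4 GPUs with bf16 mixed precision (adjust to your hardware):

accelerate launch --num_processes 4 --mixed_precision bf16 train.py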
- We provide a script for batch inference on the SubjectSpatial200K dataset in both the training-free and training-based versions.
- It can also run on your custom datasets through your own Dataset and DataLoader implementations (see the sketch below).
python test.py
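A minimal sketch of a custom multi-conditional Dataset for batch inference. The folder layout and the batch keys (`canny`, `subject`, `prompt`) are assumptions; match them to whatever `test.py` actually consumes:

```python
from pathlib import Path

from PIL import Image
from torch.utils.data import DataLoader, Dataset


class MyConditionDataset(Dataset):
    """One subfolder per sample, holding the condition images and a prompt file."""

    def __init__(self, root: str):
        self.samples = sorted(p for p in Path(root).iterdir() if p.is_dir())

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        folder = self.samples[idx]
        return {
            "canny": Image.open(folder / "canny.jpg").convert("RGB"),
            "subject": Image.open(folder / "subject.jpg").convert("RGB"),
            "prompt": (folder / "prompt.txt").read_text().strip(),
        }


# PIL images cannot go through the default collate; keep batches as lists of dicts.
loader = DataLoader(MyConditionDataset("my_data"), batch_size=4, collate_fn=list)
```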
@article{wang2025unicombine,
title={UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer},
author={Wang, Haoxuan and Peng, Jinlong and He, Qingdong and Yang, Hao and Jin, Ying and Wu, Jiafu and Hu, Xiaobin and Pan, Yanjie and Gan, Zhenye and Chi, Mingmin and others},
journal={arXiv preprint arXiv:2503.09277},
year={2025}
}