# Configuration
We use the hydra library to handle config files; see their docs for a comprehensive description of how it works. Here we explain the bare minimum setup to get you going with the extension.
Your `extensions/sd-webui-bayesian-merger/` folder is organised as follows:

```
├── README.md
├── bayesian_merger.py
├── conf/...
├── install.py
├── logs/...
├── models/...
├── requirements.txt
├── sd_webui_bayesian_merger/...
├── tests/...
└── wildcards/...
```
In this page we focus on the `conf/` folder and its content:

```
├── conf
│   ├── config.tmpl.yaml
│   ├── optimisation_guide
│   │   └── guide.tmpl.yaml
│   └── payloads
│       ├── cargo
│       │   └── payload.tmpl.yaml
│       └── cargo.tmpl.yaml
```
As you can see, there are four `.tmpl.yaml` files in a nested folder structure. You need to copy and rename each of them in the following way:

- `config.tmpl.yaml` -> `config.yaml`
- `cargo.tmpl.yaml` -> `cargo.yaml`
- `payload.tmpl.yaml` -> `payload.yaml`
- `guide.tmpl.yaml` -> `guide.yaml`
resulting in:

```
├── conf
│   ├── config.tmpl.yaml
│   ├── config.yaml
│   ├── optimisation_guide
│   │   ├── guide.tmpl.yaml
│   │   └── guide.yaml
│   └── payloads
│       ├── cargo
│       │   ├── payload.tmpl.yaml
│       │   └── payload.yaml
│       ├── cargo.tmpl.yaml
│       └── cargo.yaml
```
Let's have a look at each of them.

## config.yaml

The file begins with a `defaults` section:

```yaml
defaults:
  - _self_
  - payloads: cargo
  - optimisation_guide: guide
```
These will be the same for all users; no need to change anything.
```yaml
run_name: ${optimiser}_${scorer_method}
hydra:
  run:
    dir: logs/${now:%Y-%m-%d_%H-%M-%S}_${run_name}
```
`run_name` can be anything you want, even empty. By default it is set so that the optimiser name and the scorer method are concatenated. This is used to create a sub-folder in the `logs` directory to contain all your results. Change it as you like (see here for an explanation of `${...}` templating).
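For instance, a hypothetical custom name (the timestamp in the folder name is illustrative):

```yaml
run_name: my_first_merge
# results would land in e.g. logs/2024-01-01_12-00-00_my_first_merge
```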
```yaml
url: http://127.0.0.1:7860
```
This is the URL used to connect to the webui API; the one above is the default when launching webui with the `--api` flag. In case you use `--nowebui`, change it to `http://127.0.0.1:7861`.
```yaml
device: cpu
work_device: cpu
threads: 1
```
This is where the script will run. We suggest leaving it set to `cpu` so that VRAM stays free for generations. In any case, you can set it to `gpu` (or `cuda`) and use your GPU VRAM. `work_device` is the one used for merging-only operations, while `device` is the generic one. In case you do merging on `cpu`, you can speed up calculations by multithreading the process, i.e., increasing the number of `threads`. When using `gpu`, leave `threads: 1`.
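As a sketch, one possible split (assuming a CUDA-capable GPU with spare VRAM):

```yaml
device: cpu        # generic device stays on cpu
work_device: cuda  # merging-only operations run on the GPU
threads: 1         # multithreading only helps for cpu merges
```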
```yaml
wildcards_dir: path/to/wildcards/folder
```
This extension re-implements the wildcard extension for various reasons you do not need to care about. As a result, if you want to use wildcards in your prompts, you need to tell the extension where to find them.
```yaml
scorer_model_dir: path/where/to/save/scorer/models
```
This is where you want the aesthetic scorer models to be downloaded and stored.
```yaml
model_a: path/to/model_a
model_b: path/to/model_b
model_c: path/to/model_c
merge_mode: weighted_sum
weights_clip: False
rebasin: False
rebasin_iterations: 1
```
Where to find the two/three models to merge and how to do it. For merging we use an external library, sd-meh. Read more about `merge_mode`s, `rebasin` and `weights_clip` in the sd-meh Wiki.
```yaml
prune: False
```
When merging on gpu you may want to save VRAM by enabling pruning. This strips off all the model parts which are not important for merging. At the end of the merge the model is rebuilt so that you can generate images as if nothing had happened.
```yaml
batch_size: 1
```
How many images to generate per prompt.
```yaml
optimiser: bayes # tpe
init_points: 1
n_iters: 1
```
Here you can select an optimiser, the number of warmup/exploration points (`init_points`) and the number of optimisation/exploitation points (`n_iters`).
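As an illustrative (not prescriptive) budget, you could explore broadly first and then let the optimiser exploit:

```yaml
optimiser: bayes
init_points: 10  # random warmup points
n_iters: 50      # guided optimisation steps
```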
```yaml
latin_hypercube_sampling: False # bayes optimiser only
```
By default (when `latin_hypercube_sampling: False`) we randomly sample a uniform distribution for each merging parameter. The idea is that in the (very) long run we'll cover the entire search space in a uniform way. However, when we sample more than one parameter at the same time, it's difficult to sample all the combinations by random chance. For example, say we have two parameters `p1` and `p2`, and we sample 10 points `(p1, p2)` with a random uniform distribution over `[0, 1]`. The search space is inside the dashed lines and we can see how the distribution of points is not even across the two dimensions. Let's now sample 100 points.
The distributions are better but not flat yet. We should go up to 1000 samples to get that kind of distribution. This is where the latin hypercube sampling (LHS) algorithm helps. LHS ensures a uniform coverage of the search space even for a small sample size. Let's compare LHS vs random sampling at sample size 10. Already the LHS distributions look more uniform than random sampling. Even better when increasing the sample size to 100.
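If you want to try it, flip the flag in `config.yaml`:

```yaml
latin_hypercube_sampling: True # bayes optimiser only
```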
```yaml
bounds_transformer: False # works only with bayes optimiser
guided_optimisation: False
```
`bounds_transformer` is a feature of the `bayes` optimiser. This will transform the optimisation boundaries during the run. That is, it will reduce the search space and (hopefully) speed up convergence. You can read more about `bounds_transformer` here.

`guided_optimisation` is available for both the `bayes` and `tpe` optimisers. This is a manual override for defining the optimisation search space. When `guided_optimisation: True`, you also need to fill in the `conf/optimisation_guide/guide.yaml` config. Read more about this in the Guided Optimisation page.
```yaml
save_imgs: False
```

Whether to save generated images or not.
```yaml
scorer_method: chad # laion, manual
```
Pick an automatic scoring method, either `chad` or `laion`. We also have a `manual` method, which will prompt you to score images one-by-one. It may be tedious, but at least you are sure that the scores reflect your taste. Have a go with it!
```yaml
save_best: False
best_format: safetensors # ckpt
best_precision: 16 # 32
```
Whether to save the best merged model (at the end of the optimisation run) or not. In case this is set to `True`, you can also pick the model format and precision to save it in.
```yaml
draw_unet_weights: False
draw_unet_base_alpha: False
```
These can be used to skip optimisation and draw only the UNET visualisation.
## cargo.yaml

This file defines the default image generation options. Again, the file begins with a `defaults` section (naming is quite confusing, I know):

```yaml
defaults:
  - cargo:
    - payload
```
Here you can tell the extension which payloads you want to generate images with. In this example we have only one, but we can make as many as we want. For example:

```yaml
defaults:
  - cargo:
    - dog
    - cat
    - horse
```
In this case we'll ask webui to generate image(s) for each of the `dog`, `cat` and `horse` payloads (more in the payload.yaml section).
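Note that each payload listed here needs a matching file in `conf/payloads/cargo/` (described in the payload.yaml section below), e.g.:

```
conf/payloads/cargo/
├── dog.yaml
├── cat.yaml
└── horse.yaml
```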
The following are all the webui options. The values you set here will be used by all the payloads. One thing to remember is that these values will be overridden by payload-specific ones, e.g., there's no point in having a global `prompt`, but you may use `negative_prompt` below to avoid retyping it several times (see the example after the options list).
```yaml
prompt: ""
negative_prompt: ""
score_weight: 1.0
n_iter: 1
batch_size: 1
steps: 20
cfg_scale: 7
width: 512
height: 512
sampler_name: Euler
sampler_index: Euler
seed: -1
subseed: -1
subseed_strength: 0
seed_resize_from_h: -1
seed_resize_from_w: -1
enable_hr: false
denoising_strength: 0
firstphase_width: 0
firstphase_height: 0
hr_scale: 2
hr_upscaler: ""
hr_second_pass_steps: 0
hr_resize_x: 0
hr_resize_y: 0
styles: []
restore_faces: false
tiling: false
eta: 0
s_churn: 0
s_tmax: 0
s_tmin: 0
s_noise: 1
```
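For example, a shared negative prompt set once in `cargo.yaml` (values are illustrative) will be inherited by every payload that doesn't override it:

```yaml
negative_prompt: "lowres, bad anatomy, watermark"
steps: 30
```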
## payload.yaml

As mentioned before, you can have as many payloads as you want in the `conf/payloads/cargo/` folder. These are structured as follows:

```yaml
payloadname:
  parameter1: value
  parameter2: value
  ...
```
where the `parameter`s can be any from the `cargo.yaml` file. One thing you need to do is change `payloadname` to something different for each payload. For example, for our `dog` payload, we'll have a `conf/payloads/cargo/dog.yaml` file reading:
```yaml
dog:
  score_weight: 1.0
  prompt: "a drawing of a dog"
  negative_prompt: "3d"
  steps: 30
  cfg: 6
  width: 512
  height: 768
  sampler_name: "Euler a"
```
Note how only a few parameters are explicitly set; all the others will take default values from the `cargo.yaml` file. Of particular interest is `score_weight: 1.0` in this file. This is a parameter we can use to (wait for it...) weight the score for all the images generated with this prompt. This can help in defining a hierarchy of concepts you want the merge to be optimised for.
You may have noticed that `batch_size` is defined twice in our configs: in `config.yaml` and in `cargo.yaml` (or, if you prefer, in each payload `.yaml` file). This is not a mistake, but a quirk of how the extension works. Put simply (a concrete example follows the list):

- `batch_size` in `config.yaml` (let's call it `bayesian-batch_size` here) rules how many times your prompt is rendered. In case you use wildcards, this will randomise them `bayesian-batch_size` times. When not using wildcards, the prompt will always be the same. Note that multiple images will be generated by separate API calls, not at the same time as when setting `batch_size > 1` in the webui (`webui-batch_size`). Thus, `bayesian-batch_size = 100` will not crash your GPU as `webui-batch_size = 100` may do.
- `batch_size` in `cargo.yaml` is the actual webui `batch_size` (`webui-batch_size`). Note that this will not render different prompts when using wildcards (this is the quirk I was referring to), but it will generate multiple images in one call. This may be faster if you can afford enough VRAM.