
Configuration


Folder structure

We use the hydra library to handle config files; see its docs for a comprehensive description of how it works. Here we explain the bare minimum setup to get you going with the extension.

Your extensions/sd-webui-bayesian-merger/ folder is organised as follows:

├── README.md
├── bayesian_merger.py
├── conf/...
├── install.py
├── logs/...
├── models/...
├── requirements.txt
├── sd_webui_bayesian_merger/...
├── tests/...
└── wildcards/...

On this page we focus on the conf/ folder and its contents:

├── conf
│   ├── config.tmpl.yaml
│   ├── optimisation_guide
│   │   └── guide.tmpl.yaml
│   └── payloads
│       ├── cargo
│       │   └── payload.tmpl.yaml
│       └── cargo.tmpl.yaml

As you can see, there are four .tmpl.yaml files in a nested folder structure. You need to copy and rename each of them in the following way:

  • config.tmpl.yaml -> config.yaml
  • cargo.tmpl.yaml -> cargo.yaml
  • payload.tmpl.yaml -> payload.yaml
  • guide.tmpl.yaml -> guide.yaml

resulting in

├── conf
│   ├── config.tmpl.yaml
│   ├── config.yaml
│   ├── optimisation_guide
│   │   ├── guide.tmpl.yaml
│   │   └── guide.yaml
│   └── payloads
│       ├── cargo
│       │   ├── payload.tmpl.yaml
│       │   └── payload.yaml
│       ├── cargo.tmpl.yaml
│       └── cargo.yaml

Let's have a look at each of them.

config.yaml

defaults

The file begins with a defaults section

defaults:
  - _self_
  - payloads: cargo
  - optimisation_guide: guide

These are the same for all users; no need to change anything.

run_name

run_name: ${optimiser}_${scorer_method}
hydra:
  run:
    dir: logs/${now:%Y-%m-%d_%H-%M-%S}_${run_name}

run_name can be anything you want, even empty. By default it is set to the concatenation of the optimiser name and the scorer method. It is used to create a sub-folder in the logs directory that holds all the results of a run. Change it as you like (see the Hydra docs for an explanation of the ${...} templating syntax).
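For example, assuming you also want the merge mode in the folder name, any Hydra interpolation of top-level keys from this file should work the same way (a sketch, not a required format):

run_name: ${merge_mode}_${scorer_method}
hydra:
  run:
    dir: logs/${now:%Y-%m-%d_%H-%M-%S}_${run_name}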

url

url: http://127.0.0.1:7860

This is the URL used to connect to the webui API; the one above is the default when launching webui with the --api flag. In case you use --nowebui, change it to http://127.0.0.1:7861.

device

device: cpu
work_device: cpu
threads: 1

This is where the script will run. We suggest leaving it set to cpu so that VRAM stays free for image generation. In any case, you can set it to gpu (or cuda) and use your GPU VRAM. work_device is the device used for merging-only operations, while device is the generic one. If you merge on cpu, you can speed up calculations by multithreading, i.e., by increasing the number of threads. When using gpu, leave threads: 1.
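For example, a possible split (assuming a CUDA-capable GPU is available) that runs the merge itself on the GPU while keeping the generic device on cpu:

device: cpu
work_device: cuda
threads: 1  # keep at 1 when merging on gpu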

*_dir

wildcards_dir: path/to/wildcards/folder

This extension re-implements the wildcard extension for various reasons you do not need to care about. As a result, if you want to use wildcards in your prompts, you need to tell the extension where to find them.
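As an illustration (the animal wildcard file and the double-underscore syntax below are assumptions based on the common wildcard convention, not taken from this wiki):

wildcards_dir: ./wildcards

# assuming wildcards/animal.txt exists with one option per line (dog, cat, ...),
# a payload prompt can then reference it as:
# prompt: "a photo of a __animal__"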


scorer_model_dir: path/where/to/save/scorer/models

This is where you want the aesthetic scorer models to be downloaded and stored.

model_*

model_a: path/to/model_a
model_b: path/to/model_b
model_c: path/to/model_c
merge_mode: weighted_sum
weights_clip: False
rebasin: False
rebasin_iterations: 1

Where to find the two/three models to merge and how to merge them. For merging we use an external library, sd-meh. Read more about merge_mode, rebasin and weights_clip in the sd-meh wiki.
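As a sketch, a three-model merge using sd-meh's add_difference mode (the paths are placeholders; check the sd-meh wiki for the full list of modes) could look like:

model_a: models/base.safetensors
model_b: models/finetune.safetensors
model_c: models/original.safetensors  # only needed by three-model modes
merge_mode: add_difference
weights_clip: True
rebasin: False
rebasin_iterations: 1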

prune

prune: False

When merging on gpu you may want to save VRAM by enabling pruning. This strips off all the model parts that are not needed for merging. At the end of the merge the model is rebuilt, so you can generate images as if nothing happened.

batch_size

batch_size: 1

How many images to generate per prompt.

optimiser

optimiser: bayes # tpe
init_points: 1
n_iters: 1

Here you can select an optimiser, the number of warmup/exploration points (init_points) and the number of optimisation/exploitation points (n_iters). In total, the optimiser will run init_points + n_iters merge-and-score rounds.
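For example, a more realistic budget (the numbers are illustrative only) that spends 10 rounds exploring and 40 exploiting, for 50 merges in total:

optimiser: bayes
init_points: 10  # random exploration merges first
n_iters: 40      # then guided optimisation merges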

sampling

latin_hypercube_sampling: False  # bayes optimiser only

By default (when latin_hypercube_sampling: False) we randomly sample a uniform distribution for each merging parameter. The idea is that in the (very) long run we'll cover the entire search space uniformly. However, when we sample more than one parameter at a time, it's hard to cover all the combinations by random chance. For example, say we have two parameters p1 and p2, and we sample 10 points (p1, p2) from a random uniform distribution over [0, 1]:

[Figure: marginal distributions of 10 random samples over the two parameters]

The search space is inside the dashed lines, and we can see how the distribution of points is not even across the two dimensions. Let's now sample 100 points:

[Figure: marginal distributions of 100 random samples]

The distributions are better but not flat yet. We would need to go up to 1000 samples to get close to a flat distribution:

[Figure: marginal distributions of 1000 random samples]

This is where the latin hypercube sampling (LHS) algorithm helps. LHS ensures a uniform coverage of the search space even for a small sample size. Let's compare LHS vs random sampling at sample size 10:

[Figure: LHS vs random sampling, 10 samples]

The LHS distributions already look more uniform than the random ones; it gets even better when increasing the sample size to 100:

[Figure: LHS vs random sampling, 100 samples]
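To use LHS instead of random sampling for the exploration points, flip the flag in config.yaml:

latin_hypercube_sampling: True  # bayes optimiser only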

guided_optimisation

bounds_transformer: False # works only with bayes optimiser
guided_optimisation: False

bounds_transformer is a feature of the bayes optimiser. It transforms the optimisation boundaries during the run, i.e., it shrinks the search space and (hopefully) speeds up convergence. You can read more about bounds_transformer in the bayesian-optimization library docs.

guided_optimisation is available for both the bayes and tpe optimisers. It is a manual override for defining the optimisation search space. When guided_optimisation: True, you also need to fill in the conf/optimisation_guide/guide.yaml config. Read more about this on the Guided Optimisation page.

save_imgs

save_imgs: False

Whether to save the generated images or not.

scorer_*

scorer_method: chad # laion, manual

Pick an automatic scoring method, either chad or laion. We also have a manual method, which prompts you to score images one by one. It may be tedious, but at least you are sure the scores reflect your taste. Have a go with it!

save_best

save_best: False
best_format: safetensors # ckpt
best_precision: 16 # 32

Whether to save the best merged model at the end of the optimisation run. If set to True, you can also pick the format and precision the model is saved in.

draw_*

draw_unet_weights: False
draw_unet_base_alpha: False

These can be used to skip optimisation and only draw the UNET visualisation.

cargo.yaml

This file defines the default image generation options. Again, the file begins with a defaults section (the naming is quite confusing, I know).

defaults:
  - cargo:
    - payload

Here you can tell the extension which payloads you want to generate images with. In this example we have only one, but we can add as many as we want. For example:

defaults:
  - cargo:
    - dog
    - cat
    - horse

In this case we'll ask webui to generate image(s) for each of the dog, cat and horse payloads (more in the payload.yaml section).
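Each name listed under cargo must match a payload file in conf/payloads/cargo/ (see the payload.yaml section), so the example above assumes a layout like:

├── conf
│   └── payloads
│       └── cargo
│           ├── dog.yaml
│           ├── cat.yaml
│           └── horse.yaml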


The following are all the webui options. The values you set here will be used by all the payloads, but they can be overridden by payload-specific ones. E.g., there's no point in having a global prompt, but you may set negative_prompt here to avoid retyping it several times (see the example after the options listing below).

prompt: ""
negative_prompt: ""

score_weight: 1.0

n_iter: 1
batch_size: 1
steps: 20
cfg_scale: 7
width: 512
height: 512
sampler_name: Euler
sampler_index: Euler

seed: -1
subseed: -1
subseed_strength: 0
seed_resize_from_h: -1
seed_resize_from_w: -1

enable_hr: false
denoising_strength: 0
firstphase_width: 0
firstphase_height: 0
hr_scale: 2
hr_upscaler: ""
hr_second_pass_steps: 0
hr_resize_x: 0
hr_resize_y: 0

styles: []

restore_faces: false
tiling: false

eta: 0
s_churn: 0
s_tmax: 0
s_tmin: 0
s_noise: 1
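For instance, a cargo.yaml that shares a negative prompt and step count across the dog and cat payloads (the values are illustrative only) might start like:

defaults:
  - cargo:
    - dog
    - cat

negative_prompt: "lowres, blurry, 3d"
steps: 25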

payload.yaml

As mentioned before, you can have as many payloads as you want in the conf/payloads/cargo/ folder. These are structured as follows:

payloadname:
  parameter1: value
  parameter2: value
  ...

where the parameters can be any of those from the cargo.yaml file. One thing you need to do is change payloadname to something unique. For example, for our dog payload, we'll have a conf/payloads/cargo/dog.yaml file reading:

dog:
  score_weight: 1.0
  prompt: "a drawing of a dog"
  negative_prompt: "3d"
  steps: 30
  cfg_scale: 6
  width: 512
  height: 768
  sampler_name: "Euler a"

Note how only a few parameters are explicitly set; all the others take their default values from the cargo.yaml file. Of particular interest is score_weight: 1.0 in this file. This is a parameter we can use to (wait for it...) weight the scores of all the images generated with this prompt. This can help in defining a hierarchy of concepts you want the merge to be optimised for.
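As a sketch, assuming the score of each payload's images is simply scaled by its score_weight, you could make cat images count half as much as dog images like this:

# conf/payloads/cargo/dog.yaml
dog:
  score_weight: 1.0
  prompt: "a drawing of a dog"

# conf/payloads/cargo/cat.yaml
cat:
  score_weight: 0.5
  prompt: "a drawing of a cat"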

Let's talk about batch_size

You may have noticed that batch_size is defined twice in our configs: in config.yaml and in cargo.yaml (or, if you want, in each payload .yaml file). This is not a mistake, but a quirk of how the extension works. Put simply:

  • batch_size in config.yaml (let's call it bayesian-batch_size here) rules how many times your prompt is rendered. In case you use wildcards, these will be randomised bayesian-batch_size times. When not using wildcards, the prompt will always be the same. Note that the multiple images are generated by separate API calls, not at the same time as when setting batch_size>1 in the webui (webui-batch_size). Thus, bayesian-batch_size = 100 will not crash your GPU as webui-batch_size = 100 may do.
  • batch_size in cargo.yaml is the actual webui batch_size (webui-batch_size). Note that this will not render different prompts when using wildcards (this is the quirk I was referring to), but it will generate multiple images in one call. This may be faster if you can afford enough VRAM.
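Putting the two together (the numbers are illustrative, and the multiplication is my reading of the description above): with the settings below, each payload triggers 4 separate API calls, each returning 2 images, i.e. 8 images per payload in total.

# config.yaml
batch_size: 4  # bayesian-batch_size: 4 API calls, wildcards re-randomised each time

# cargo.yaml (or a single payload)
batch_size: 2  # webui-batch_size: 2 images generated together per call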