Something wrong with img2img samplers #889
Comments
I just confirmed that img2img with the plms and k* samplers is working as expected on a CUDA system. This may be an M1-specific issue, although I'd be surprised if this was the case. Could you post the original images and the exact prompts you used? |
I'm getting the same weird washed-out and blurry images at lower img2img strengths, also on M1. @netsvetaev's original images are the last two images they posted, btw. And here's k_lms, k_euler, and ddim, each at 32 steps and -f 0.25, 0.5, 0.75.
|
I noticed that the sampler only runs for one step (regardless of input), which is probably the reason for the blurry results. |
This seems to be an MPS-related bug. I don’t have a Mac to test on, but I’ll investigate any suspicious “if MPS” statements. Maybe @mh-dm or @Any-Winter-4079 could have a look? |
I don't have an MPS device to test on. |
@lstein I'm not convinced it is MPS-related. I have the exact same behavior after forcing my torch device to be `cpu` in `choose_torch_device()`:

```python
def choose_torch_device() -> str:
    '''Convenience routine for guessing which GPU device to run model on'''
    # if torch.cuda.is_available():
    #     return 'cuda'
    # if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
    #     return 'mps'
    return 'cpu'
```

and in `fix_func()`:

```python
def fix_func(orig):
    # if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
    #     def new_func(*args, **kw):
    #         device = kw.get("device", "mps")
    #         kw["device"] = "cpu"
    #         return orig(*args, **kw).to(device)
    #     return new_func
    return orig
```

Each step now takes about 4x as long, and there are no messages about mismatched torch devices, so I think I've forced everything onto the CPU. |
I have been writing some documentation for img2img, which involved visualizing the latent space, and I noticed that when the step count does not scale with |
Ok, there is something wrong on a deeper level. I have reverted the force-cpu changes to my local branch and run some tests with strength 0.75 and 0.85, each running from steps 15 to ~40. Sampler DDIM. My REPL command:
Strength 0.75 images (the zip is named f7.5 but it's f0.75): Strength 0.85 images (the zip is named f8.5 but it's f0.85): Images are prefixed by the step count entered in the REPL (not the actual steps).

Have a scroll through those and you will notice some interesting patterns. There seem to be two types of result, which I'll call the "common" type (very close to the init image) and the "occasional" type (which diverges more). You get 4 or 5 "common" types, then a couple of "occasional" types, then 4 or 5 "common" types, then a couple of "occasional" types... It's far more obvious on the f0.85 images which are "common" and which are "occasional".

Also, it seems like the more steps, the closer the "common" images are to the init image. I would have expected the images to converge, but not on something very much like the original... That ain't right! |
@psychedelicious you might want to try my branch |
This is actually a subtle thing that I haven't seen any of the SD GUIs communicate well. Because of the way the SD algorithm works (actually, diffusion algorithms more generally), doing steps=50 is not equivalent to doing steps=49 and then feeding the 49th image in for "one more" step. If the step count is different, then the amount of denoising that happens at each step is different, all the way back to the first step, so the differences you're seeing are expected. The best way to get a handle on this, I found, was to try low step counts and look at the latents at each step. If there's interest I can clean up my intermediate writer and submit it as a feature..? |
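To make that concrete, here's a tiny illustration using k-diffusion's Karras schedule helper (purely illustrative; the sigma_min/sigma_max values are just typical SD-ish numbers, and the repo's actual schedule may be built differently):

```python
import k_diffusion as K

# The noise schedule is recomputed from the *total* step count, so the per-step
# sigmas shift whenever that count changes: 49 steps plus "one more" is not the
# same trajectory as 50 steps from scratch.
s49 = K.sampling.get_sigmas_karras(n=49, sigma_min=0.03, sigma_max=14.6, rho=7.0)
s50 = K.sampling.get_sigmas_karras(n=50, sigma_min=0.03, sigma_max=14.6, rho=7.0)

print(s49[:6])
print(s50[:6])  # same endpoints, but every intermediate noise level differs
```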
@damian0815 Thanks, I didn't understand what was really going on internally, so thanks for explaining that. Do I understand you correctly that the patterns I am seeing are to be expected? I checked out your branch, but it seems to be missing something. I discovered by trial and error that, in the step_callback, calling |
Yep, that's the problem: see how we've already diverged enough to see with our eyes alone after about the 3rd or 4th step. The top image is 20 steps.

@psychedelicious I've fixed the issue on my branch. If you fetch again you can try it, and I'd love to hear your comments on my updated img2img docs! :) |
@psychedelicious @lstein However, with higher strength the results are a lot closer / identical. |
@damian0815 Awesome guide! It would be great if you decide to merge it. |
Thanks @Any-Winter-4079, I closed @hipsterusername's issue without noticing it was on a different architecture. I think something got misplaced in the shuffle during @lstein's recent changes... |
This also produces the same result after #925 (btw, not sure how this affected ddim?) So it might be fixed now? |
@netsvetaev @damian0815 I leave it to you to test / check if it works on your end, or if you see some issue |
Oh, also about @psychedelicious's finding, which may be an additional issue.
I can still reproduce this. |
Using my CUDA system I've just compared the output of the img2img ddim sampler between the current code and a version from September 15, long before I made any changes to the samplers. The results are identical. I don't see any blurriness or color degradation:
I'm not expecting the images to be the same between M1 and CUDA. However, I find it alarming that I'm not getting anything that looks like what the prompt is asking for. This is something that I've noticed in passing with smaller init images on a couple of occasions.

To check this out, I rescaled the init image to 512x512, applied the same prompt and other parameters, and voila! Upping the strength and CFG to 0.8 and 15.0 respectively gives me this:

Then tweaking the prompt a bit gives something more photorealistic:
So in summary, we've got multiple bugs:
|
Just to confirm, the washed-out issue is affecting the DDIM sampler? I did refactor a large amount of common code shared by ddim and plms, so it's possible I broke something in a way that only manifests on M1. Has anyone tried a bisect to track down the offending commit? |
Now that's very odd. The e601163 commit was from before I added support for the k* samplers. So presumably you generated a DDIM image here. I'd proposed the test in order to see if there was a regression on ddim. I'm away from my system now, but as soon as I'm back I'll see if I can reproduce the color distortion on CUDA. |
Oh, well, DDIM seems fine now in terms of washed-out effects (see #889 (comment)). Update: yes, |
I'm just about at my depth of understanding here (over my depth, to be honest), but my understanding is that each of the sigmas represents the amount of noise to inject and denoise at each step. We can get away with removing the 0.0 at the end; this is just a placeholder that prevents an index error at the very last step. My current hypothesis is that there is something amiss with the step that occurs just before the noising/denoising loop starts. In this step, the latent image is noised with one of the sigma values.
If the wrong sigma index is being applied at this step, that would explain the behavior we're seeing. I briefly experimented with varying this step, but haven't explored it exhaustively. What bugs me is that this works fine in the plms and ddim samplers, so why should it change? |
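For reference, a minimal sketch of the kind of pre-loop noising being described, in the style of a k-diffusion img2img entry point (`model_wrap`, `init_latent`, and `sample_fn` are illustrative names, not the repo's exact API):

```python
import torch

def img2img_start(model_wrap, init_latent, steps, strength, sample_fn):
    # Descending noise levels for the full run; k-diffusion model wrappers expose
    # a get_sigmas(n) helper, but the exact wiring here is an assumption.
    sigmas = model_wrap.get_sigmas(steps)
    t_enc = int(strength * steps)   # number of denoising steps that will actually run
    start = steps - t_enc           # index of the first sigma we use
    noise = torch.randn_like(init_latent)
    x = init_latent + noise * sigmas[start]           # noise the init image to that level...
    return sample_fn(model_wrap, x, sigmas[start:])   # ...then denoise over the rest of the schedule
```

Picking the wrong `start` index here is exactly the kind of mistake that would bake in too much or too little noise before the loop ever runs.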
@lstein Yes, img2img works fine with k-diffusion in my fork: Birch-san/stable-diffusion@7b42d40. Low strengths work (they don't go blurry like yours), but so many high-sigma denoising steps are employed that very little of the original latents survives. Strength 0.3 is the lowest I've ever gotten coherent results from: |
@Birch-san THANK YOU! When I reviewed your code I found the place I was going wrong. As I suspected, it was the step of adding noise to the init_latent. I don't know what part of the world you live in, but if you're ever in Toronto swing by and I'll buy you a beer or three. |
I just committed the fix to |
Ah, I may have spoken too soon. There's too much noise being added at higher strengths, and the image is replaced completely past a certain point. I've reverted and will continue to explore the problem. |
I'm in the UK. always up for a beer 🍻 |
I'll be in Cambridge this February. Are you in London? |
Another example. All of this is on top of having the changes from 2c27e75. |
And here's another example; I'm close to calling it a success for me. The only thing is, there is probably a better formula than leaving out the first 9 sigmas (which might be a bit too creative in terms of resulting outputs). I'll try to compare this too with @Birch-san's behavior for the same original image. @lstein let me know if you experience the same on CUDA and whether the washed-out effect is gone with this (and if this is a bug, I hope the experiments help you figure it out). |
@lstein can discuss on Discord -- are you on the LAION server? |
@Birch-san - Invoke's Discord >> https://discord.gg/ZmtBAhwWhy |
I got it working. The key was to remove the stochastic_encode() call. I will be committing in a minute. |
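For context, a rough sketch of the two approaches (the `stochastic_encode()` call is the CompVis DDIMSampler method mentioned above; the sigma-based branch is my reading of the replacement, not the exact committed code):

```python
import torch

def encode_init_latent(sampler, init_latent, sigmas, steps, t_enc, use_k_sigmas=True):
    """Noise the init latent before img2img denoising begins (illustrative sketch)."""
    if not use_k_sigmas:
        # DDIM-style: stochastic_encode() works off the DDPM alphas, which don't
        # line up with the noise levels the k-samplers expect.
        t = torch.tensor([t_enc], device=init_latent.device)
        return sampler.stochastic_encode(init_latent, t)
    # k-sampler style: noise directly to the sigma the sampler will start from, so
    # the starting noise level matches its own schedule. Getting this index right
    # is the subtlety the thread has been circling around.
    return init_latent + torch.randn_like(init_latent) * sigmas[steps - t_enc]
```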
I am just removing debugging code and will commit in a sec. @Any-Winter-4079 , please compare with your solution and let me know which one is working better. UPDATE: pushed. The revised code is now in |
I've got discord open at this link, but I'm such a noob I don't know how to rendezvous with you folk. What are your discord names? |
have made contact |
@lstein 7541c7c looks good in general. The washed-out effect is completely removed. Playing with an old commit (2c27e75), I noticed that by playing with the sigmas you can increase/decrease the creativity (the more sigmas you remove from the start, the less creativity it has). But my problem was that I was removing a fixed number of sigmas, regardless of steps and strength. So by removing a dynamic number of sigmas (like in 7541c7c) to fix this problem, we can also obtain nice results:
I'm not saying one option should be preferred over the other (how much variation we should have at a given strength is debatable). Also, something I've seen is that increasing the number of steps by, for example, 20% in 7541c7c won't produce the same results as the original step count.
|
To get to the image obtained with |
I guess it's a matter of preference. For example, with 7541c7c we can't obtain this particular image. So we're basically shifting the strength to start at more or less creativity, which affects how far into creativity we can go when we reach the top of the strength range. |
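To make the two variants being compared concrete, a small sketch (assuming a k-diffusion style `get_sigmas(steps)` helper; names and indexing are illustrative, not the exact code in either commit):

```python
def trimmed_sigmas(model_wrap, steps, strength, fixed_trim=None):
    """Return the portion of the sigma schedule img2img actually runs over."""
    sigmas = model_wrap.get_sigmas(steps)
    if fixed_trim is not None:
        # Fixed trim (the 2c27e75-era experiments): always drop the same number of
        # high-noise sigmas, regardless of steps and strength.
        return sigmas[fixed_trim:]
    # Dynamic trim (my reading of 7541c7c): drop a strength-proportional amount, so
    # higher strength keeps more of the high-noise, more "creative" steps.
    t_enc = int(strength * steps)
    return sigmas[steps - t_enc:]
```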
There's also a |
I would expose sigma_min, sigma_max and rho (mostly relevant for users trying to get good results at a low number of sampler steps, by tactically choosing the range and curve of their schedule). Rho, the deliberate exclusion of sigmas from the schedule (e.g. an increased sigma_min), and the sigmas on which the model trained are all demonstrated in the linked examples. And yes, exposing churn is a good idea. I saw in k-diffusion's CLIP-guided diffusion example that a typical value for churn is 50. One problem with churn at the moment is that k-diffusion doesn't yet provide any way to discretize the sigma_hats that arise after applying churn, so the model is asked to denoise sigmas on which it never trained. I think it'd probably still look relatively good though. |
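For anyone following along, here's a hedged sketch of what exposing those knobs could look like with k-diffusion's existing API; `get_sigmas_karras` and the `s_churn` argument are real, but the toy denoiser is a stand-in so the snippet runs on its own:

```python
import torch
import k_diffusion as K

# Karras-style schedule with the knobs under discussion exposed.
sigmas = K.sampling.get_sigmas_karras(
    n=20,             # sampler steps
    sigma_min=0.03,   # lowest noise level to visit
    sigma_max=14.6,   # highest noise level to visit
    rho=7.0,          # curvature: how steps are distributed between min and max
)

# Stand-in denoiser; in the app this would be the CFG-wrapped CompVisDenoiser
# around the SD model.
def toy_denoiser(x, sigma, **kwargs):
    return x / (1 + sigma[:, None, None, None])

x = torch.randn(1, 4, 64, 64) * sigmas[0]
# s_churn re-injects noise at each step; ~50 is the value mentioned above from
# k-diffusion's CLIP-guided example.
samples = K.sampling.sample_euler(toy_denoiser, x, sigmas, s_churn=50.0)
```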
personally I find the idea of |
Tweaking |
Describe your environment
Describe the bug
I'm trying the new samplers with img2img (klms & keuler), but I always get strange results.
I use the classic command from https://github.com/invoke-ai/InvokeAI/blob/main/docs/features/IMG2IMG.md, just with "-A klms" added, for example.
strength 0.5
strength 0.3
Originals: