
img2img will soon work with all samplers #790

Closed
lstein opened this issue Sep 24, 2022 · 15 comments

Comments

@lstein
Collaborator

lstein commented Sep 24, 2022

Just a note to update everyone who has been waiting on this feature: I've added img2img support for all the samplers to my local repository. This includes all the k* samplers. I will be putting up a PR sometime Sunday after I have thoroughly tested the changes.

Unfortunately I had to make minor code changes to the k-diffusion library in order for this to work, so we will be back to using my fork of the k-diffusion code. However, I will also make a PR against @Birch-san's k-diffusion, which is what we have been using as the upstream.

After confirming the samplers are all working, I'll add the awaited enhancements to inpainting and outpainting, and hopefully deal with the recent new bug reports and pull requests.

@Birch-san

nice! what changes did you need to make? I've been using img2img with k-diffusion samplers just fine, without any changes (except for general changes related to Mac support).

@Birch-san

Birch-san commented Sep 24, 2022

adding img2img support to txt2img, including k-diffusion support, just looked like this (copied from how Deforum Diffusion did it):
Birch-san/stable-diffusion@7b42d40
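For context, that pattern (a sketch reconstructed from the general approach, not the linked commit verbatim; `model`, `init_image`, `cond`, `steps`, and `strength` are assumed inputs) boils down to noising the encoded init image to an intermediate sigma and handing the sampler only the tail of the schedule:

```python
import torch
import k_diffusion as K

model_wrap = K.external.CompVisDenoiser(model)   # k-diffusion wrapper around the LDM
sigmas = model_wrap.get_sigmas(steps)            # full noise schedule, high -> low

t_enc = int(strength * steps)                    # how much of the schedule to run
sigma_sched = sigmas[len(sigmas) - t_enc - 1:]   # tail of the schedule only

init_latent = model.get_first_stage_encoding(model.encode_first_stage(init_image))
x = init_latent + torch.randn_like(init_latent) * sigma_sched[0]  # noise to the start sigma

samples = K.sampling.sample_lms(model_wrap, x, sigma_sched, extra_args={'cond': cond})
```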

@lstein
Collaborator Author

lstein commented Sep 25, 2022

I basically did the same thing, except that I refactored the samplers to inherit from a common base (reducing the code length by about 50%) and got rid of decode(), so that both img2img and txt2img use sample() uniformly.
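The shape of that refactor, as a minimal sketch with hypothetical names (not the actual classes): the base class owns the sampling loop, and each sampler supplies a single update step.

```python
import torch

class Sampler:
    """Hypothetical common base: the loop lives here, once."""
    def sample(self, x, sigmas, cond):
        for i in range(len(sigmas) - 1):
            x = self.step(x, sigmas[i], sigmas[i + 1], cond)
        return x

    def step(self, x, sigma, sigma_next, cond):
        raise NotImplementedError

class EulerSampler(Sampler):
    def __init__(self, denoiser):
        self.denoiser = denoiser

    def step(self, x, sigma, sigma_next, cond):
        denoised = self.denoiser(x, sigma, cond=cond)
        d = (x - denoised) / sigma           # ODE derivative (Karras form)
        return x + d * (sigma_next - sigma)  # explicit Euler step
```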

I also changed the k-diffusion sample_* functions so that I had access to the sample update function (the inner loop), rather than relying on their built-in iteration. This is needed to support inpainting, where the mask is applied to the sample at every iteration. I futzed around with the callback, but it didn't seem to provide a way to alter the sample within the loop.
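The mask-at-every-step requirement looks roughly like this (a sketch, not the actual patch; `p_sample` stands in for the exposed per-step update, and `mask`, `orig_latent`, and `noise` are assumed inputs):

```python
import torch

def masked_sample(p_sample, x, sigmas, cond, mask=None, orig_latent=None, noise=None):
    for i in range(len(sigmas) - 1):
        x = p_sample(x, sigmas[i], sigmas[i + 1], cond)  # one exposed inner-loop update
        if mask is not None:
            # Re-noise the original latent to the current level and blend it
            # back in under the mask, so only the unmasked region evolves.
            noised_orig = orig_latent + noise * sigmas[i + 1]
            x = mask * noised_orig + (1 - mask) * x
    return x
```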

@Birch-san

Birch-san commented Sep 25, 2022

it's worth mentioning this use-case to @crowsonkb; Katherine is considering what methods need to be exposed by the model base classes, and whether there's a hook you need in sampling: that might be relevant input into how the API should be designed.

the tensor decorate() callback parameter I proposed here might be a generalized pattern useful for hooking into and fiddling with particular steps of the sampling algorithms.
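As a rough illustration of that decorate() idea (hypothetical signature, not the actual proposal's API): the sampler threads the latent through a user-supplied function at each step and continues with whatever comes back.

```python
import torch

def sample_euler(denoiser, x, sigmas, cond, decorate=lambda x, i: x):
    # Plain Euler sampler with a hypothetical decorate() hook at each step.
    for i in range(len(sigmas) - 1):
        denoised = denoiser(x, sigmas[i], cond=cond)
        d = (x - denoised) / sigmas[i]           # ODE derivative (Karras form)
        x = x + d * (sigmas[i + 1] - sigmas[i])  # Euler step
        x = decorate(x, i)  # caller may return a modified latent (e.g. apply a mask)
    return x
```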

@lstein
Collaborator Author

lstein commented Sep 25, 2022

I've been considering how to use the callbacks so I don't have to modify Katherine's code. Maybe I'm not understanding it properly. The issue is that the inpainting code I'm using needs to insert itself into the middle of the sampling loop in order to update the image with the mask. So a typical piece of sampling code looks like:

def k_sampler(img, callback, args):
    for i in range(steps):
        # ...lots of math...
        callback({'image': img, 'args': args})
        # ...lots more math...

The sample is in img and it is updated each time through the loop. The loop executes steps times, and the mask code wants to get in there to modify img and pass it back to the next run through the loop. However, I don't think that the callback can modify the variables in the function.
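For what it's worth, k-diffusion's callbacks do receive the live tensor (the payload is a dict along the lines of {'x': x, 'i': i, 'sigma': ..., 'denoised': ...}), and PyTorch tensors are passed by reference, so an in-place write inside the callback does reach the loop. The catch is that the callback fires mid-step, after that step's denoised prediction has already been computed from the unmodified latent. A sketch of the in-place idea (`mask`, `orig_latent`, and `noise` are assumed inputs):

```python
import torch

def make_mask_callback(mask, orig_latent, noise):
    def callback(info):
        x, sigma = info['x'], info['sigma']
        noised_orig = orig_latent + noise * sigma
        # copy_() mutates the sampler's own tensor in place, so the change
        # survives into the remainder of the step.
        x.copy_(mask * noised_orig + (1 - mask) * x)
    return callback
```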

@lstein
Collaborator Author

lstein commented Sep 25, 2022

I've also been having a hard time puzzling out how your k-sampler/img2img integration works. You're making a call to a sampler object's decode() method to run the sampling. I'd like to see how this code talks to the k-sampler functions, but when I look for sampler modules in your distribution's directories, all I can find are ddpm, ddim, and plms.

Meanwhile there's something wrong with my own implementation. The k_* samplers require a large number of steps or a high f value to produce satisfactory results. I've been searching, but I haven't found what I'm doing wrong.

@Birch-san

Birch-san commented Sep 25, 2022

The k-diffusion sampling code is just this:
Birch-san/stable-diffusion@6a74a6a
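For readers not chasing the commit: the integration in these forks is typically a thin module that wraps the LDM in k-diffusion's CompVisDenoiser and folds classifier-free guidance into a single denoiser call. A sketch of that shape (hypothetical names, not the commit verbatim):

```python
import torch
import torch.nn as nn

class CFGDenoiser(nn.Module):
    """Folds classifier-free guidance into one k-diffusion-compatible call."""
    def __init__(self, model):
        super().__init__()
        self.inner_model = model  # e.g. a CompVisDenoiser-wrapped LDM

    def forward(self, x, sigma, uncond, cond, cond_scale):
        x_in = torch.cat([x] * 2)          # batch cond and uncond together
        sigma_in = torch.cat([sigma] * 2)
        cond_in = torch.cat([uncond, cond])
        uncond_out, cond_out = self.inner_model(x_in, sigma_in, cond=cond_in).chunk(2)
        return uncond_out + (cond_out - uncond_out) * cond_scale
```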

@Birch-san

Birch-san commented Sep 25, 2022

Heun can get good results in a few steps (e.g. 7):
https://twitter.com/Birchlabs/status/1564792349221330944

The key is to use the noise schedule that Karras et al. proposed in the same paper (k-diffusion's get_sigmas_karras() function), discretize the sigmas, and set the sigma_min slightly above the minimum supported by the model:
crowsonkb/k-diffusion#23

Here's an exploration of the impact of which sigmas you sample from:
https://twitter.com/Birchlabs/status/1565114066548527104?s=20&t=axLeSaZV28s5m104_vMdxQ
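Putting those recommendations together might look like this (a sketch under stated assumptions: `model` is the LDM, `cond` the conditioning, and the 1.1 nudge factor is purely illustrative):

```python
import torch
import k_diffusion as K

model_wrap = K.external.CompVisDenoiser(model, quantize=True)  # snap to trained sigmas
sigma_min = model_wrap.sigmas[0].item()    # smallest sigma the model was trained on
sigma_max = model_wrap.sigmas[-1].item()

# Karras et al. schedule; nudge sigma_min slightly above the model's floor.
sigmas = K.sampling.get_sigmas_karras(n=7, sigma_min=sigma_min * 1.1,
                                      sigma_max=sigma_max, rho=7., device='cuda')
x = torch.randn([1, 4, 64, 64], device='cuda') * sigmas[0]
samples = K.sampling.sample_heun(model_wrap, x, sigmas, extra_args={'cond': cond})
```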

@lstein
Collaborator Author

lstein commented Sep 27, 2022

This is hugely helpful. Just to be clear: the CompVisDenoiser() object in k-diffusion's external.py module comes with a get_sigmas() method. To use the Karras sigmas, I should ignore what get_sigmas() gives me and use the discretized sigmas from get_sigmas_karras() instead?

Given that I'm motivated to do this for img2img, will making this change have a bad effect on txt2img?

@tildebyte
Contributor

I'd definitely like to be able to test it out with txt2img (I've been following Birch-san's discussion of the Karras noise schedule with great interest).

@Birch-san

Birch-san commented Sep 28, 2022

@lstein yes, a key finding of Karras et al. (2022) was that we can decouple the noise schedule from the model. that's why we stop using a method of the model wrapper, and use a schedule that is constructed more or less without knowledge of the model at all.

neither approach will inherently give you discretized sigmas. but if you construct your CompVisDenoiser with quantize=True -- and are sure not to change the sampler's default of churn=0 -- then you'll get discretized sigmas, yes.

if you want to play around with increasing churn (i.e. to inject noise, to make it more "creative" / varied -- similar to the effect you observe with Euler ancestral), then you would need to wait for crowsonkb/k-diffusion#23 to be sure the sigmas are still discretized.

there would be no bad effect for txt2img, only positive effects. this basically just implements the state-of-the-art sampling recommendations from that Karras paper.
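The churn knob mentioned above is the s_churn parameter on k-diffusion's Karras samplers. Reusing the names from the earlier sketch (and an illustrative churn value, not a recommendation):

```python
import k_diffusion as K

# s_churn=0. (the default) keeps every sampled sigma on the trained grid when
# the denoiser was built with quantize=True; a positive value injects fresh
# noise each step (more varied output), but the perturbed sigmas are then no
# longer guaranteed to be discretized until the linked fix lands.
samples = K.sampling.sample_heun(model_wrap, x, sigmas,
                                 extra_args={'cond': cond}, s_churn=40.)
```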

@tildebyte
Contributor

> if you want to play around with increasing churn (i.e. to inject noise, to make it more "creative" / varied

Yes, please 🤣

@lstein
Collaborator Author

lstein commented Sep 28, 2022 via email

@lstein
Collaborator Author

lstein commented Oct 1, 2022

Well, the Karras noise schedule seems to work very well. The k-samplers are doing great for img2img. @Birch-san thank you for the help! I found your code particularly helpful.

One thing, however, that I found puzzling: in your repository's call to K.sampling.get_sigmas_karras(), you include a keyword argument of concat_zero. However, I don't see this keyword in the method signature in either your or @crowsonkb's k-diffusion repositories. Does this append an extra zero to the Karras noise array, and if so, why is that necessary?

@Birch-san

concat_zero=True is equivalent to the default k-diffusion behaviour; I exposed that to provide a way to not end the sigma ramp with a zero.

See the first post in this thread for what concat_zero did and why it appeared to help:
crowsonkb/k-diffusion#23

I thought "if discretization means we're only allowed to sample from sigmas the model was trained on: we shouldn't include 0 in our schedule, since the model never trained on that".

So I ramped down to sigma_min and stopped there instead of continuing to 0. It gave way better results (for a small number of sample steps), but mostly because it has the effect of making the ramp longer, which means sampling from more small sigmas.
Katherine pointed out that it was mathematically wrong to exclude zero (it means you don't fully denoise the image), and that the significant part of what I did was equivalent to picking a higher sigma_min (instead of going down to the smallest sigma the model supports).
I also checked the paper more closely, and zero is a special case (never actually sampled from), so it doesn't matter that it's not a sigma the model trained on.

So yeah, I removed concat_zero=False and instead provided a function to compute the same sigma_min that it gave (effectively the penultimate sigma from a noise ramp one longer than your current one, going down to the model's smallest sigma).
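Reconstructed from that description (a sketch, not the actual function; the name is hypothetical):

```python
import k_diffusion as K

def equivalent_sigma_min(n, sigma_min_model, sigma_max, rho=7.):
    """Penultimate sigma of a Karras ramp one step longer than yours that
    descends all the way to the model's smallest trained sigma."""
    # get_sigmas_karras returns n ramp values plus a trailing zero.
    sigmas = K.sampling.get_sigmas_karras(n + 1, sigma_min_model, sigma_max, rho=rho)
    return sigmas[-3].item()  # [-1] is the appended 0, [-2] is sigma_min_model
```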

lstein closed this as completed on Oct 12, 2022