img2img will soon work with all samplers #790
Nice! What changes did you need to make? I've been using img2img with k-diffusion samplers just fine, without any changes (except for general changes related to Mac support).
Adding img2img support to txt2img, including k-diffusion support, just looked like this (copied from how Deforum Diffusion did it):
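(The snippet itself didn't survive the copy; as a rough, hedged sketch of the usual img2img trick it describes -- start denoising partway down the noise schedule -- with all names and the strength convention being assumptions, not the actual code:)

```python
def img2img_start(init_latent, noise, sigmas, strength):
    """Start sampling partway down the schedule, a common img2img approach:
    skip the first (1 - strength) fraction of sigmas, and add noise scaled
    by the first remaining sigma to the encoded init image.

    init_latent and noise are flat lists of floats here, purely for
    illustration; real code operates on latent tensors.
    """
    start = int(len(sigmas) * (1.0 - strength))
    x = [z + eps * sigmas[start] for z, eps in zip(init_latent, noise)]
    return x, sigmas[start:]
```

With strength near 1 almost the whole schedule runs (output diverges from the init image); with strength near 0 only the last few small sigmas run, so the init image dominates.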
I basically did the same thing, except that I have refactored the samplers so that they inherit from a common code base (reducing the code length by about 50%), and got rid of decode(), so that both img2img and txt2img use sample() uniformly. I also changed the k-diffusion sample_* functions so that I had access to the sample update function (the inner loop), rather than rely on their built-in iteration. This is needed in order to support inpainting, where the mask is applied to the sample at every iteration. I futzed around with the callback, but it didn't seem to provide a way to alter the sample within the loop.
It's worth mentioning this use-case to @crowsonkb; Katherine is considering what methods need to be exposed by the model base classes, and whether there's a hook you need in sampling: that might be a relevant input into how the API should be designed.
I've been considering how to use the callbacks so I don't have to modify Katherine's code; maybe I'm not understanding them properly. The issue is that the inpainting code I'm using needs to insert itself into the middle of the sampling loop in order to update the image with the mask, applying the mask to the sample at every step.
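As a toy illustration of the per-step mask update described above (a plain-Python sketch; `denoise`, the Euler step, and the blend formula are illustrative assumptions, not this thread's actual code):

```python
def sample_with_mask(x, masked_ref, mask, sigmas, denoise):
    """Toy Euler sampling loop that re-imposes known pixels each step.

    x, masked_ref, mask are flat lists of floats; mask[i] == 1.0 marks a
    pixel that must keep the reference value (the region outside the
    inpainting mask).
    """
    for i in range(len(sigmas) - 1):
        # Euler step: move x along the denoising direction for this sigma
        d = [(xi - denoise(xi, sigmas[i])) / sigmas[i] for xi in x]
        dt = sigmas[i + 1] - sigmas[i]
        x = [xi + di * dt for xi, di in zip(x, d)]
        # the inpainting hook: blend the mask back in at *every* iteration,
        # which is why a post-hoc callback isn't enough
        x = [m * ref + (1.0 - m) * xi
             for m, ref, xi in zip(mask, masked_ref, x)]
    return x
```

The point of the sketch is structural: the blend happens inside the loop, so the sampler must expose its inner update step (or a callback that can mutate the sample).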
I've also been having a hard time puzzling out how your k-sampler/img2img integration works. You're making a call to a sampler object's decode() method to run the sampling. I'd like to see how this code talks to the k-sampler functions, but when I look for sampler modules in your distribution's directories, all I can find are ddpm, ddim, and plms. Meanwhile, there's something wrong with my own implementation: the k_* samplers require a large number of steps, or a high f, to produce satisfactory results. I've been searching, but I haven't found what I'm doing wrong.
The k-diffusion sampling code is just this:
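(The snippet didn't survive the copy; as a rough pure-Python sketch of what a Heun-style k-diffusion sampling step does -- this is Algorithm 1 from Karras et al. 2022 restated on scalars, not the library's actual code:)

```python
def heun_sample(x, sigmas, denoise):
    """Heun's method: an Euler (predictor) step, then a correction using
    the average of the slopes at both ends of the step."""
    for i in range(len(sigmas) - 1):
        d = (x - denoise(x, sigmas[i])) / sigmas[i]
        dt = sigmas[i + 1] - sigmas[i]
        x_pred = x + d * dt                # Euler predictor step
        if sigmas[i + 1] == 0:
            x = x_pred                     # final step lands on sigma = 0
        else:
            d2 = (x_pred - denoise(x_pred, sigmas[i + 1])) / sigmas[i + 1]
            x = x + dt * (d + d2) / 2      # trapezoidal correction
    return x
```

The second model evaluation per step is what buys Heun its accuracy at low step counts, at roughly twice the cost per step of plain Euler.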
Heun can get good results in few steps (e.g. 7). The key is to use the noise schedule that Karras et al proposed in the same paper (implemented in k-diffusion). Here's an exploration of the impact of which sigmas you sample from.
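The Karras schedule itself is a small formula; a minimal sketch (the SD-v1 sigma range 0.0292 to 14.6146 used as defaults here is an assumption -- check your model's actual trained sigma range):

```python
def karras_sigmas(n, sigma_min=0.0292, sigma_max=14.6146, rho=7.0):
    """Noise schedule from Karras et al. 2022: a ramp from sigma_max down
    to sigma_min, warped by rho so more of the steps are spent at small
    sigmas, with a final 0 appended so sampling ends on a clean image."""
    min_inv = sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    sigmas = [(max_inv + i / (n - 1) * (min_inv - max_inv)) ** rho
              for i in range(n)]
    return sigmas + [0.0]
```

The rho warp is the whole trick: rho = 1 gives a linear ramp, while larger rho concentrates steps at small sigmas, where fine detail is resolved.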
This is hugely helpful. Just to be clear: given that I'm motivated to do this for img2img, will making this change have a bad effect on txt2img?
I'd definitely like to be able to test it out with txt2img (I've been following Birch-san's discussion of the Karras noise schedule with great interest).
@lstein yes, a key finding of Karras et al 2022 (https://arxiv.org/abs/2206.00364) was that we can decouple the noise schedule from the model. That's why we stop using a method of the model wrapper, and use a schedule that is constructed more or less without knowledge of the model at all.

Neither approach will inherently give you discretized sigmas. But if you construct your CompVisDenoiser with quantize=true (https://github.com/Birch-san/stable-diffusion/blob/87da4c85ce107b779937b66a0fe9f95817e65722/scripts/txt2img_fork.py#L633), and are sure not to change the default of churn=0, then you'll get discretized sigmas, yes.

If you *want* to play around with increasing churn (i.e. to inject noise, to make it more "creative" / varied -- similar to the effect you observe with Euler ancestral), then you would need to wait for crowsonkb/k-diffusion#23 to be sure the sigmas are still discretized.

There would be no bad effect for txt2img, only positive effects. This basically just implements the state-of-the-art sampling recommendations from that Karras paper.
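Conceptually, quantizing just snaps each requested sigma onto the discrete set of sigmas the model was trained on; a hypothetical sketch of the idea (not the actual CompVisDenoiser source):

```python
def quantize_sigma(sigma, trained_sigmas):
    # snap a continuous sigma to the nearest sigma the model saw during
    # training -- the idea behind constructing the denoiser with
    # quantize=true
    return min(trained_sigmas, key=lambda s: abs(s - sigma))
```

This is also why injected churn noise can break discretization: after adding noise, the effective sigma no longer lies on the trained grid unless it is re-snapped.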
Yes, please 🤣 |
I changed the sigmas to use the Karras schedule yesterday but had to get on a flight and haven't tested it thoroughly yet. All I can say right now is that it didn't actively break anything. I really appreciate your help with this, and I'll check back with you in a few days when I resume work on this branch.

Lincoln
Well, the Karras noise schedule seems to work very well. The k-samplers are doing great for img2img. @Birch-san, thank you for the help! I found your code particularly helpful. One thing that I found puzzling, however, was a call in your repository.
concat_zero=True is equivalent to the default k-diffusion behaviour; I exposed it to provide a way to not end the sigma ramp with a zero. See the first post in this thread for what concat_zero did and why it appeared to help. I thought, "if discretization means we're only allowed to sample from sigmas the model was trained on, we shouldn't include 0 in our schedule, since the model never trained on that". So I ramped down to sigma_min and stopped there instead of continuing to 0. It gave way better results (for a small number of sample steps), but mostly because it has the effect of making the ramp longer, which means sampling from more small sigmas. So yeah, I removed concat_zero=False and instead provided a function to compute the same sigma_min that it gave (effectively the penultimate sigma from a noise ramp one longer than your current one, going down to the model's smallest sigma).
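A sketch of that replacement computation as I read the description above (the function name, rho default, and Karras-ramp formula are assumptions, not the repository's actual helper): take the penultimate sigma of a Karras ramp one point longer that runs all the way down to the model's smallest sigma.

```python
def ramp_sigma_min(n, model_sigma_min, sigma_max, rho=7.0):
    """Penultimate sigma of an (n + 1)-point Karras ramp ending at the
    model's smallest trained sigma. Using this as sigma_min for an
    n-point ramp makes the schedule stop just short of the model's
    minimum, reproducing what concat_zero=False used to give."""
    min_inv = model_sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    longer = [(max_inv + i / n * (min_inv - max_inv)) ** rho
              for i in range(n + 1)]
    return longer[-2]
```

The returned value sits strictly between the model's minimum and maximum sigma, so the shortened ramp never asks the model to denoise at a sigma smaller than it trained on.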
Just a note to update everyone who has been waiting on this feature: I've added img2img support for all the samplers to my local repository, including all the k* samplers. I will be putting up a PR sometime Sunday, after I have thoroughly tested the changes.

Unfortunately I had to make minor code changes to the k-diffusion library in order for this to work, so we will be back to using my fork of the k-diffusion code. However, I will also make a PR against @Birch-san's k-diffusion, which is what we have been using for the upstream.

After confirming the samplers are all working, I'll add the awaited enhancements to inpainting and outpainting, and hopefully deal with the recent bug reports and pull requests.