
feat: Textual Inversion (embeddings) #129

Closed
wants to merge 6 commits into from

Conversation

FSSRepo
Contributor

@FSSRepo FSSRepo commented Dec 29, 2023

I made a quick implementation to support embeddings. It's not the best way to do it, but once the code is refactored, the handling of embeddings could be improved. In some cases there are very large embeddings that could consume more than the available context, leading to overflow or errors.

You can specify the embedding to use by adding the embedding's filename to the prompt (usually the negative prompt), separated by commas. Like this:

sd -p "a lovely cat" -n "EasyNegative, bad image" --embd-dir models/embeddings

NOTE: EasyNegative alone uses 75 tokens, so there are only 2 tokens left.
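As a rough illustration of the budget issue above (a minimal sketch with made-up names, not the actual stable-diffusion.cpp API), the check amounts to:

```cpp
#include <cassert>

// CLIP's context length, the default this PR works within.
const int MAX_POSITION_EMBEDDINGS = 77;

// Returns true if an embedding of `embd_tokens` token vectors still fits
// next to the `used_tokens` already consumed by the rest of the prompt.
bool embedding_fits(int used_tokens, int embd_tokens) {
    return used_tokens + embd_tokens <= MAX_POSITION_EMBEDDINGS;
}
```

With EasyNegative occupying 75 tokens, `embedding_fits(75, n)` only holds for `n <= 2`.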

@leejet Any suggestions to improve this?

@FSSRepo
Contributor Author

FSSRepo commented Dec 29, 2023

@Green-Sky Try to see if the errors are fixed with the last commit I made. You don't necessarily have to use the feature addressed in this PR.

@diimdeep

diimdeep commented Dec 29, 2023

This is great.
But is max_position_embeddings = 77 a sane default? I'm not sure how the PyTorch-based implementations handle it, but AFAIK they have no problem with very long prompts, even when using multiple embeddings at the same time.
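For reference, my understanding (an assumption, not verified against their code) is that the PyTorch-based UIs get around the 77-token limit by splitting the token stream into 75-token chunks, encoding each chunk separately with BOS/EOS padding, and concatenating the resulting embeddings. A minimal sketch of just the splitting step:

```cpp
#include <algorithm>
#include <vector>

const int CHUNK_SIZE = 75; // 77 minus BOS and EOS

// Splits a tokenized prompt into chunks of at most CHUNK_SIZE tokens;
// each chunk would then be encoded independently by the text encoder.
std::vector<std::vector<int>> split_prompt(const std::vector<int>& tokens) {
    std::vector<std::vector<int>> chunks;
    for (size_t i = 0; i < tokens.size(); i += CHUNK_SIZE) {
        size_t end = std::min(tokens.size(), i + CHUNK_SIZE);
        chunks.emplace_back(tokens.begin() + i, tokens.begin() + end);
    }
    return chunks;
}
```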

@Green-Sky
Contributor

@FSSRepo finally, the funky neon looking vae artifacts are gone! 🎉

I noticed that vae tiling still has the low-res artifacts.

e.g. (attached image: output_5)

@diimdeep

diimdeep commented Dec 29, 2023

from https://civitai.com/models/72437/baddream-unrealisticdream-negative-embeddings

```
[DEBUG] stable-diffusion.cpp:1529 - parse 'ugly embd:unrealisticdream' to [['ugly embd:unrealisticdream', 1], ]
[INFO]  model.cpp:641  - load models/embeddings/unrealisticdream.pt using checkpoint format
[DEBUG] model.cpp:1142 - init from 'models/embeddings/unrealisticdream.pt'
[DEBUG] model.cpp:1224 - loading tensors from models/embeddings/unrealisticdream.pt
ggml_new_object: not enough space in the context's memory pool (needed 57216, available 32768)
[DEBUG] stable-diffusion.cpp:1529 - parse 'blurry embd:baddream' to [['blurry embd:baddream', 1], ]
[INFO]  model.cpp:641  - load models/embeddings/baddream.pt using checkpoint format
[DEBUG] model.cpp:1142 - init from 'models/embeddings/baddream.pt'
[DEBUG] model.cpp:1224 - loading tensors from models/embeddings/baddream.pt
ggml_new_object: not enough space in the context's memory pool (needed 106368, available 32768)
```

@FSSRepo
Contributor Author

FSSRepo commented Dec 29, 2023

@diimdeep As mentioned in the description, excessively large embeddings are not supported. I don't know how A1111 or ComfyUI manage very long prompts. They could be truncated, something like splitting the embedding in half and only including the first half, IDK?

@leejet
Owner

leejet commented Dec 30, 2023

Any suggestions to improve this?

I think we can take a cue from how sd-webui operates. Upon detecting that a token in the prompt corresponds to a local embedding file, we can replace the token with its embedding rather than explicitly specifying it using embd:. By the way, I'm currently refactoring the project, and once it's done, it should support long prompts.
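A minimal sketch of that idea (hypothetical helper names; a real implementation would scan the files in --embd-dir rather than take a prebuilt set of names):

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Returns the comma-separated prompt fragments that match a known
// embedding name, so they can be swapped for the embedding's token
// vectors instead of being tokenized as plain text.
std::vector<std::string> find_embedding_refs(
        const std::string& prompt,
        const std::set<std::string>& known_embeddings) {
    std::vector<std::string> refs;
    std::stringstream ss(prompt);
    std::string part;
    while (std::getline(ss, part, ',')) {
        // trim surrounding whitespace
        size_t b = part.find_first_not_of(" \t");
        size_t e = part.find_last_not_of(" \t");
        if (b == std::string::npos) continue; // skip empty fragments
        part = part.substr(b, e - b + 1);
        if (known_embeddings.count(part)) refs.push_back(part);
    }
    return refs;
}
```

With this, a prompt like "EasyNegative, bad image" would pick up the EasyNegative embedding automatically, with no `embd:` prefix needed.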

@FSSRepo
Contributor Author

FSSRepo commented Jan 2, 2024

@leejet I am thinking of merging this pull request with #131 to incorporate the new refactoring changes all at once.

@Green-Sky
Contributor

Green-Sky commented Jan 2, 2024

@FSSRepo @slaren Since adding the sync AFTER the set fixes the issue, I'm pretty sure slow-ish PCIe speeds or similar break it. From what I can find online, the sync needs to happen after the memcpy, since a different stream(?) might already be accessing the data before it is ready.

edit: feels like this sync should happen after instead of before https://github.com/ggerganov/ggml/blob/fca1caafea7de9fbd7efc733b9818f9cf2da3050/src/ggml-cuda.cu#L9684C41-L9684C41
Currently it only protects against "writing while still copying to it".

@slaren

slaren commented Jan 2, 2024

That's probably the cause. I expected cudaMemcpy to be synchronous in all cases, but that's not what happens. The documentation says this:

For transfers from pageable host memory to device memory, a stream sync is performed before the copy is initiated. The function will return once the pageable buffer has been copied to the staging memory for DMA transfer to device memory, but the DMA to final destination may not have completed.

So cudaMemcpy may indeed return before the copy is completed, and a slow PCIe bus could cause the DMA transfer to not complete before the kernel is launched.
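In other words (a hedged code fragment illustrating the pattern under discussion, not the actual ggml-cuda.cu code; `dst`, `src`, `size` are placeholders), the safe ordering is to synchronize after the copy rather than before it:

```cpp
// Pageable host -> device: cudaMemcpy may return once the data reaches
// the staging buffer, while the DMA into `dst` is still in flight.
cudaMemcpy(dst, src, size, cudaMemcpyHostToDevice);
// Synchronizing here guarantees the DMA has finished before a kernel,
// possibly on another stream, reads `dst`.
cudaDeviceSynchronize();
```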

@leejet
Owner

leejet commented Jan 2, 2024

@leejet I am thinking of merging this pull request with #131 to incorporate the new refactoring changes all at once.

Great, I think this will reduce some workload.

@FSSRepo
Contributor Author

FSSRepo commented Jan 2, 2024

These changes were merged into #131.
