
Flex.1 Alpha LoRA/Finetuning #3056

Open
stepfunction83 opened this issue Jan 21, 2025 · 33 comments

Comments

@stepfunction83

stepfunction83 commented Jan 21, 2025

I think this would be a good place to discuss finetuning the new Flex.1 Alpha model created by Ostris: https://huggingface.co/ostris/Flex.1-alpha

Initial tests I've tried on training LoRAs using ai-toolkit are extremely promising, with LoRAs training much more smoothly than with Flux.1 Dev.

Currently, I believe we can train this in Kohya in a similar way to how the un-distilled versions of Flux have been trained, by treating them as Flux Schnell to bypass the guidance mechanism. Until this is built in, you can force it by temporarily changing line 62 of library/flux_utils.py from:

```python
is_schnell = not ("guidance_in.in_layer.bias" in keys or "time_text_embed.guidance_embedder.linear_1.bias" in keys)
```

to:

```python
is_schnell = True
```
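A gentler variant of the same hack, for anyone who wants to keep normal Flux detection working: gate the override behind an environment variable instead of hard-coding `True`. This is a hypothetical sketch (the `FORCE_FLEX_AS_SCHNELL` name is made up), not part of sd-scripts:

```python
import os

def flex_is_schnell(keys):
    # Hypothetical replacement for the is_schnell check in flux_utils.py.
    # Opt-in override so Flex.1 checkpoints (which still contain guidance
    # keys) can be treated as Schnell without breaking Dev detection.
    if os.environ.get("FORCE_FLEX_AS_SCHNELL") == "1":
        return True
    return not (
        "guidance_in.in_layer.bias" in keys
        or "time_text_embed.guidance_embedder.linear_1.bias" in keys
    )
```

With the variable unset, detection behaves exactly like the original line; setting `FORCE_FLEX_AS_SCHNELL=1` forces the Schnell path.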

This lets the finetuning/LoRA process begin a training run. I'm currently doing a test run of this and will post about how it goes. Obviously, it hasn't been out particularly long, but so far I have been able to start a finetuning run and the loss seems to be decreasing.

Due to the model's smaller size, it can fit entirely on a 24GB card with the fused backward pass and no block swap, resulting in faster training iterations (average of 2.54s/it on my 4090 when using an even mix of 512/768/1024 resolution images). I used exactly the same config that I use for a normal Flux run, only swapping out the model file.

Samples are garbled, as they were with the undistilled versions, so I expect there will need to be some fixes there, but they're not beyond recognition.

@stepfunction83
Author

My initial attempt with an LR of 1e-5 overtrained rapidly. A second attempt with an LR of 2e-6 seems more stable so far.

@stepfunction83
Author

Before anyone else tries this: it seems to break the guidance module that Ostris created. Some more work will be needed to explicitly exclude that module from training.

@CodeAlexx

I'm just using standard Flux settings with block swapping and other stuff, at a 1.8e-5 LR. I'm at 24k steps already and samples are decent. Also just a dataset of all 1024 x 1024, no multi-res stuff. My 3090 does 7.89 it's.. I will try your method later when I get to 50k steps

@AfterHAL

My 3090 does 7.89 it's..

@CodeAlexx: Are you saying 7.89 seconds per iteration?

@stepfunction83
Author

My 3090 does 7.89 it's..

@CodeAlexx: Are you saying 7.89 seconds per iteration?

That sounds about right given it's all 1024 resolution images. I'm using a 512/768/1024 blend on a 4090 for the 2.5s/it times.

@CodeAlexx

Yes, near 8 seconds per step. How are your samples with your hack?

@stepfunction83
Author

The samples look decent, but if you try to use it in Comfy, you'll find that guidance no longer works. The training process is likely training that part of the network as well, not realizing that it should be ignoring it.
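A straightforward way to protect that part of the network, sketched here with an assumed `guidance_in.` parameter-name prefix (taken from the checkpoint keys quoted earlier in this thread), is to freeze those parameters before handing the rest to the optimizer. This is an illustration, not sd-scripts' actual code:

```python
def trainable_params(model, freeze_prefixes=("guidance_in.",)):
    # Collect parameters to optimize, freezing any whose name starts with
    # one of freeze_prefixes. Works with any object that exposes
    # named_parameters() the way a torch.nn.Module does.
    params = []
    for name, p in model.named_parameters():
        if name.startswith(freeze_prefixes):
            p.requires_grad_(False)  # guidance module stays untouched
            continue
        params.append(p)
    return params
```

The optimizer is then built only from the returned list, so gradient updates never reach the guidance embedder.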

@stepfunction83
Author

stepfunction83 commented Jan 23, 2025

kohya-ss/sd-scripts#1891 (comment)

If anyone wants to play with this, I've created a minimal working example here:

https://github.com/stepfunction83/sd-scripts/tree/sd3

With this commit just brute forcing in the relevant code snippets from ai-toolkit:

kohya-ss/sd-scripts@b203e31

I was able to quickly train a 1000-step finetune of Flex and test it in Comfy to validate that the training does take and that the guidance module is not destroyed in the process.

Additionally, the sampling was corrected as well and now works as expected.

You can replace the default sd-scripts installation that comes with Kohya with this one and replace the Flux model file with the Flex version. Make sure to do this while the server is already running, since kohya_ss fetches the latest version of the official sd-scripts repo when it first starts up (which would overwrite the fork).

(You can probably tell I don't have much experience with this...)

@CodeAlexx

THANK YOU!! I'm new to git and how to use it. Can I just download the three changed files and replace them?

@stepfunction83
Author

stepfunction83 commented Jan 23, 2025

Make sure to pass the --bypass_flux_guidance parameter with the latest commit, and yes, you can just replace the respective files with the ones from the forked version.
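For anyone curious what a bypass like this amounts to conceptually: temporarily replace the guidance embedder's forward with a stub during training so its weights receive no useful gradient, then restore it afterward. A hypothetical sketch with made-up function and attribute names, not ai-toolkit's or sd-scripts' actual implementation:

```python
def bypass_guidance(model):
    # Route around the guidance embedder: its output is replaced by a
    # constant, so backprop through it contributes nothing.
    model._orig_guidance_forward = model.guidance_in.forward
    model.guidance_in.forward = lambda *args, **kwargs: 0.0
    return model

def restore_guidance(model):
    # Undo bypass_guidance before saving or sampling.
    model.guidance_in.forward = model._orig_guidance_forward
    del model._orig_guidance_forward
    return model
```

The important property is symmetry: every bypass during a training step is paired with a restore, so the saved checkpoint keeps Ostris's guidance module intact.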

@CodeAlexx

thank you sooooo much!

@stepfunction83
Author

Yep, let me know how your experience goes. I'll submit a PR once I get it in a slightly better state.

@CodeAlexx

I will. I won't use it for LoRA, but for finetuning with no block swap.

@stepfunction83
Author

stepfunction83 commented Jan 23, 2025 via email

@stepfunction83
Author

Created a PR to add the functionality to sd-scripts: kohya-ss/sd-scripts#1893

@stepfunction83
Author

stepfunction83 commented Jan 23, 2025

From my experimentation with finetuning so far, I've found that lower learning rates are needed than with Flux Dev. 5e-6, Cosine, 5000 steps destroyed hands and general composition, while 1e-6, Cosine, 10000 steps seems more stable so far; it may be worth going even lower than that.

When sampling, I would recommend a guidance of 5. The guidance module is not the same as base Flux and the sweet spot seems to be roughly 4.5-5.5.

For reference, I'm using 300 medium quality real images and 200 synthetic images to train a concept model.

It's also very quick to train vs finetuning Flux Dev. I'm getting 2.25s/it with a 50/50 512/768 resolution mixed training set on a 4090 using only 19GB of VRAM. With a purely 512 dataset and a couple blocks offloaded, I could definitely see this being trained on a 16GB card.
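The cosine schedules mentioned above decay the learning rate smoothly from the base value toward zero; a minimal sketch of that shape using the 1e-6 / 10000-step numbers from this comment (sd-scripts' real scheduler may add warmup or a minimum LR, so treat this as illustrative):

```python
import math

def cosine_lr(step, base_lr=1e-6, total_steps=10000):
    # Plain cosine decay: base_lr at step 0, zero at total_steps.
    t = min(step, total_steps) / total_steps
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
```

At the halfway point the LR is exactly half the base value, which is why overtraining tends to show up in the first few thousand steps when the rate is still near its peak.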

@CodeAlexx

Today I am off work and running it. Mine is a 1.8e-6 LR and at 8000 steps it's still holding up; hands and composition are still good. I was using 2 for CFG and took it to 5, and it works very well. My dataset is 6k real high-quality pics, 512 and 1024, both square.

@stepfunction83
Author

I think an LR even lower than 1e-6 may be better. Even with that, it trains quickly and reaches approximately the same place as a 5e-6 LR in 5000 steps, with fewer artifacts and quality loss. In my next run, I'll go down further to 1e-7 to see how that goes.

@CodeAlexx

I tried to start a new session, and it seems your instruction to delete the sd-scripts dir and clone your version is gone. Can you repost it so the command line will work again?

@CodeAlexx

```
/home/alex/kohya_ss/venv/bin/python3.10: can't open file '/home/alex/kohya_ss/sd-scripts/flux_train.py': [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/home/alex/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/alex/kohya_ss/venv/
```

it is missing flux_train.py

@stepfunction83
Author

stepfunction83 commented Jan 23, 2025

Okay, I've updated the functionality and I think it should be working now. Apparently sd-scripts already has a convenient toggle for turning flux guidance on and off in the Flux model parameters. Feel free to go ahead and try again.

So the steps to run are:

  1. Start Kohya server
  2. Delete sd-scripts folder
  3. Run in a terminal inside the Kohya folder: git clone https://github.com/stepfunction83/sd-scripts -b sd3
  4. Attempt training run

I'm doing a training run which is comparable to a previous one I've done, so I'll see how the results compare once it gets a little further in.

@CodeAlexx

Thank you, in about 2 hours my current training will be over and then I will use it. Thank you for your hard work! What settings did you find best? I use cosine, no warmup.

@stepfunction83
Author

stepfunction83 commented Jan 23, 2025 via email

@stepfunction83
Author

500 steps in with the same settings as a previous run and I'm already seeing different results. Seems like it is in fact making a difference this time.

@stepfunction83
Author

stepfunction83 commented Jan 24, 2025

Okay, now we're working! Results on the full finetune (4e-6 LR, 5000 Steps, Cosine) in Comfy are dramatically better! Both scene composition and hand quality are back to normal. I'm going to attempt a longer run with a lower initial LR of 1e-6, 10000 Steps, Cosine, with 1024 images introduced into the mix, and see how that goes.

@CodeAlexx

CodeAlexx commented Jan 24, 2025

I am at 3000 and they are very good. It is planned for 20k steps. I am getting 5.79 s/it (a 3090) with a mixture of 512, 1024 and some 1300 res pics. 1e-6 LR, same cosine; I am using Adafactor.

@CodeAlexx

Tested at 3500 steps -- loving the output! Thank you.

@BenDes21

Okay, now we're working! Results on the full finetune (4e-6 LR, 5000 Steps, Cosine) in comfy are dramatically better! Both scene composition and hand quality are back to normal. I'm going to attempt a longer run with a lower initial LR of 1e-6, 10000 Steps, Cosine, with 1024 images introduced into mix and see how that goes.

Hello there, is Flex 100% supported by kohya_ss (GUI)? Is it possible to have an example config? Thanks for your work.

@stepfunction83
Author

Hello there, is Flex 100% supported by kohya_ss ( GUI ) ? Is it possible to have an example config ? Thanks for your work

See my comment above: #3056 (comment)

@BenDes21

BenDes21 commented Feb 5, 2025

Hi there !

I'm getting:

```
File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 2080, in __init__
    raise ValueError(f"no metadata / メタデータファイルがありません: {subset.metadata_file}")
ValueError: no metadata / メタデータファイルがありません: /meta_lat.json
```

when running the fork. Is the meta_lat.json mandatory?

@stepfunction83
Author

Hi there! I'm getting

```
File "/workspace/kohya_ss/sd-scripts/library/train_util.py", line 2080, in __init__
    raise ValueError(f"no metadata / メタデータファイルがありません: {subset.metadata_file}")
ValueError: no metadata / メタデータファイルがありません: /meta_lat.json
```

when running the fork. Is the meta_lat.json mandatory?

Sounds like you're on the Finetune tab instead of the Dreambooth one.

@BenDes21

BenDes21 commented Feb 5, 2025

Sounds like you're on the Finetune tab instead of the Dreambooth one.

Ah yes, correct. Is there a difference between these two in terms of results, or is it just about how the dataset is set up?

@stepfunction83
Author

I believe the Finetune tab just additionally requires you to specify the metadata file. Beyond that, there isn't a substantial difference as far as I'm aware.
