Support BF16 for FSDP #963
Comments
Thank you for this issue! We are currently working on adding support for bf16 and hope to have it done very soon :) I assume you mean bf16 support with FSDP? Or were you thinking of another API?
Exactly, bf16 with FSDP!
@anj-s please let me know if there is anything we can do to help. Having support for bf16 with FSDP in Fairseq would really help us! :)
Hi, has there been any progress on resolving this issue? @anj-s
Hi @yuvalkirstain, I think this should work without any issues. Can you try using bfloat16 by passing the right compute_dtype argument when using FSDP? Unfortunately I haven't had a chance to add a unit test, but perhaps someone else on the team has looked into this. cc @anupambhatnagar @min-xu-ai
bfloat16 support with PyTorch Lightning would be even better. Is that under consideration?
Is there currently any progress on this issue?
There has been no progress on this so far.
Feature Request
Please support BF16 mixed-precision
Additional context
Training with BF16 is usually more stable than fp16, which is very important when we want to train large models. Additionally, many models (e.g. T5) are trained with BF16 and if we want to continue training them with mixed-precision, using fp16 will result in NaNs.
Thank you!
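To illustrate the range argument in the request above, here is a minimal standard-library sketch (not taken from fairscale or PyTorch). It assumes bf16 can be approximated by keeping the top 16 bits of a float32 encoding, using truncation rather than the round-to-nearest that real hardware typically applies. The helper names `to_bf16` and `fits_fp16` are hypothetical.

```python
import struct


def to_bf16(x: float) -> float:
    """Round-trip x through a simulated bfloat16.

    bfloat16 shares float32's 8-bit exponent and keeps only 7 mantissa bits,
    so we keep the upper 16 bits of the float32 encoding (truncation, which is
    a simplification of real rounding behaviour).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def fits_fp16(x: float) -> bool:
    """Return True if x is representable as a finite IEEE half-precision value."""
    try:
        struct.pack("<e", x)  # "e" is the struct code for IEEE 754 binary16
        return True
    except OverflowError:
        return False


if __name__ == "__main__":
    value = 70000.0  # slightly above the largest finite fp16 value (65504)
    print("fits in fp16:", fits_fp16(value))
    print("as simulated bf16:", to_bf16(value))
```

A value just above the fp16 maximum cannot be stored in fp16 at all, while the simulated bf16 keeps it finite at reduced precision. Overflows of this kind are one common source of the infs and NaNs described in the request when a model trained in bf16 is continued in fp16.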