[misc] feat: support rmpad/data-packing in FSDP with transformers #91
Conversation
Shall we add a supported-model list and raise an error if the model is not in the list?
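A minimal sketch of what such a check could look like; the registry name, function name, and model list below are illustrative placeholders, not the actual contents of this PR:

```python
# Hypothetical allowlist check; the set contents are assumed examples only.
SUPPORTED_RMPAD_MODELS = {"llama", "mistral", "gemma"}

def check_rmpad_support(model_type: str) -> None:
    """Fail fast if rmpad/data-packing is requested for an unverified model."""
    if model_type not in SUPPORTED_RMPAD_MODELS:
        raise NotImplementedError(
            f"rmpad is not verified for model type '{model_type}'. "
            f"Supported: {sorted(SUPPORTED_RMPAD_MODELS)}"
        )
```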
Try to avoid using `log_probs_from_logits_response_rmpad`, because there is an unpad op inside, and unpad is a CUDA-blocking op. Instead, we can directly use the already-unpadded `input_ids` from the input.
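A minimal sketch of the suggestion, assuming the caller already holds packed (rmpad) logits and token ids and has applied the causal shift; the function name is hypothetical, not the PR's actual API. Because both inputs are already packed, no unpad op (and hence no host-device sync from the underlying index computation) is needed:

```python
import torch

def log_probs_from_packed_logits(logits_rmpad: torch.Tensor,
                                 input_ids_rmpad: torch.Tensor) -> torch.Tensor:
    """Per-token log-probs on packed (rmpad) tensors, with no unpad op.

    logits_rmpad:    (total_nnz, vocab_size), logits from the packed forward
                     pass, assumed already shifted so position i predicts token i.
    input_ids_rmpad: (total_nnz,), token ids already unpadded by the caller.
    """
    log_probs = torch.log_softmax(logits_rmpad.float(), dim=-1)
    return log_probs.gather(dim=-1, index=input_ids_rmpad.unsqueeze(-1)).squeeze(-1)
```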
I think this list depends on the transformers lib. Not sure where to get this list; I didn't find any doc about the feature in transformers.
Simply add potential models to the CI. If a model passes CI, then add it to the supported list. I guess we can target …
Sure, I will write a new API for unpadding `input_ids`.
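A sketch of what such an API could look like, in plain torch rather than flash_attn's `bert_padding` helpers; the name `unpad_input_ids` and the exact return signature are assumptions, not the PR's final API. The idea is to pay the blocking index computation once at the input boundary so downstream rmpad code never unpads again:

```python
import torch
import torch.nn.functional as F

def unpad_input_ids(input_ids: torch.Tensor, attention_mask: torch.Tensor):
    """Flatten (batch, seqlen) input_ids into packed (total_nnz,) form.

    Returns the packed ids plus the metadata varlen attention kernels expect:
    flat indices of the real tokens, cumulative sequence lengths, and the
    longest sequence length in the batch.
    """
    mask = attention_mask.bool()
    input_ids_rmpad = input_ids[mask]                              # (total_nnz,)
    indices = torch.nonzero(mask.flatten(), as_tuple=False).flatten()
    seqlens = attention_mask.sum(dim=-1, dtype=torch.int32)        # (batch,)
    cu_seqlens = F.pad(torch.cumsum(seqlens, dim=0, dtype=torch.int32), (1, 0))
    max_seqlen = int(seqlens.max())                                # host sync, done once
    return input_ids_rmpad, indices, cu_seqlens, max_seqlen
```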
Shall we add test_transformers.py to the CI? I didn't do it, since I think it only depends on the transformers version and the flash_attn version. So I guess the goal of the CI is to test whether the latest transformers + flash_attn would break our implementation.
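A sketch of the kind of consistency check such a CI test could run: logits at real-token positions should not change when padding is added, which is the invariant rmpad relies on. The model name is a lightweight placeholder; the real test would load the flash_attn-enabled models this PR targets and compare padded vs packed forward passes:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

@torch.no_grad()
def test_padding_invariance(model_name: str = "gpt2"):  # placeholder model
    tok = AutoTokenizer.from_pretrained(model_name)
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()

    texts = ["hello world", "a longer example sentence used to force padding"]
    batch = tok(texts, return_tensors="pt", padding=True)
    padded_logits = model(**batch).logits

    for i, text in enumerate(texts):
        single = tok(text, return_tensors="pt")
        n = single["input_ids"].shape[1]
        ref_logits = model(**single).logits[0]
        # Right padding + causal attention: the first n positions must agree.
        torch.testing.assert_close(padded_logits[i, :n], ref_logits,
                                   atol=2e-2, rtol=2e-2)
```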
After this PR, we should set a minimum required version of transformers.
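A sketch of such a guard; the version floor is a placeholder, since the thread does not state the exact minimum here:

```python
from packaging import version

import transformers

MIN_TRANSFORMERS = "4.38.0"  # placeholder floor, to be pinned once CI confirms it

if version.parse(transformers.__version__) < version.parse(MIN_TRANSFORMERS):
    raise ImportError(
        f"rmpad/data-packing needs transformers>={MIN_TRANSFORMERS}, "
        f"but found {transformers.__version__}"
    )
```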
- Users can set
  ```
  actor_rollout_ref.model.use_rmpad=True \
  +critic.model.use_rmpad=True \
  +reward_model.model.use_rmpad=True
  ```
  to enable rmpad for the different models. The default is `False`.
- Use `AutoModelForTokenClassification` for the Value and Reward Model, instead of using `SequenceClassification` (see the sketch after this list).
- Add `log_probs_from_logits_response_rmpad`.
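A sketch of the model-class switch described above: with rmpad, the critic/reward model needs one scalar per token, which matches the token-classification head with `num_labels=1` rather than the sequence-classification head. The checkpoint name is a placeholder, and whether a given architecture ships a token-classification head depends on the transformers version:

```python
from transformers import AutoModelForTokenClassification

# num_labels=1 gives one scalar output per token, which is what a per-token
# value/reward head needs once sequences are packed.
critic_model = AutoModelForTokenClassification.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base",  # placeholder checkpoint
    num_labels=1,
)
```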
Resolve: #53
Comparison using DeepSeek 7B and GSM8k:
About 1.7x speedup compared to no rmpad (the original case).