
FIX: Enable non-strict loading of state dicts #295

Conversation

BenjaminBossan
Member

What does this PR do?

Resolves #278

PyTorch allows loading state dicts with the strict=False argument to ignore missing keys. This is now also supported in optimum-quanto. Before this fix, a KeyError would be raised.

One context where this is important is for parameter-efficient fine-tuning adapters such as LoRA. There, we want to load only a small subset of parameters and leave the other model weights untouched. This requires non-strict loading.
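To illustrate the behavior this PR enables, here is a minimal sketch in plain PyTorch (not the optimum-quanto internals this PR touches; the model and keys are made up): with strict=False, the keys that are present get loaded and the rest are reported instead of raising.

```python
import torch
import torch.nn as nn

# A toy model with two layers, so the state dict has four keys:
# "0.weight", "0.bias", "1.weight", "1.bias"
base = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))

# Pretend this partial state dict came from a fine-tuned adapter:
# it only covers the second layer.
partial = {"1.weight": torch.zeros(2, 8), "1.bias": torch.zeros(2)}

# strict=True (the default) would raise because "0.weight"/"0.bias"
# are absent; strict=False loads what is there and reports the rest.
result = base.load_state_dict(partial, strict=False)
print(sorted(result.missing_keys))  # ['0.bias', '0.weight']
```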

Before submitting

  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you run all tests locally and make sure they pass? Only a subset.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@BenjaminBossan
Member Author

@dacorvo I created the PR to fix non-strict loading. As the code changed a bit compared to what I had on the issue, and since I wanted to support int4 and int8, the changes are a bit different from what we discussed there. LMK if something should be changed or is still missing.

Apparently the commit is not "conventional", not sure what I missed there. Should I try to fix this via rebase or can this be fixed later via squash+merge?

@dacorvo
Collaborator

dacorvo commented Aug 26, 2024

Apparently the commit is not "conventional", not sure what I missed there. Should I try to fix this via rebase or can this be fixed later via squash+merge?

You can amend your commit to "fix: Enable non-strict loading of state dicts" to make it conventional and force-push.
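For reference, one minimal way to do this from the command line (assuming the fix is the most recent commit on the branch):

```shell
# Rewrite the latest commit message to follow the conventional-commit format
git commit --amend -m "fix: Enable non-strict loading of state dicts"

# Update the already-pushed PR branch; --force-with-lease refuses to
# overwrite commits someone else pushed in the meantime
git push --force-with-lease
```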

@BenjaminBossan force-pushed the fix-enable-non-strict-loading-of-state-dicts branch from a9ec7fa to 8b59252 on August 26, 2024 16:29
@BenjaminBossan
Member Author

You can amend your commit to "fix: Enable non-strict loading of state dicts" to make it conventional and force-push.

Thanks, done.

Collaborator

@dacorvo left a comment

Thank you very much for this neat pull request. Looking forward to seeing how you use quanto in peft!

@dacorvo merged commit f9b71f4 into huggingface:main on Aug 27, 2024
15 checks passed
@BenjaminBossan deleted the fix-enable-non-strict-loading-of-state-dicts branch on September 23, 2024 09:07
Development

Successfully merging this pull request may close these issues.

Non-strict loading of the state dict
2 participants