
Add support for stop sequences to HF models #1188

Merged 5 commits into UKGovernmentBEIS:main on Jan 25, 2025

Conversation

MikhailTerekhov
Contributor

This PR contains:

  • New features
  • Changes to dev-tools e.g. CI config / github tooling
  • Docs
  • Bug fixes
  • Code refactor

What is the current behavior? (You can also link to an open issue here)

Currently, HF models do not support stop sequences, even though the underlying functionality is present in HuggingFace Transformers' model.generate. I was not sure whether this is a bug or a missing feature, so I marked both.

What is the new behavior?

This PR adds support for stop_seqs to HF models, along with a test that checks that stop sequences now work. One can also confirm that the test fails before the commit that fixes the issue.
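For context, the post-processing that stop-sequence support typically requires can be sketched as a small helper that truncates generated text at the earliest stop sequence. This is a hypothetical illustration, not the PR's actual code; in HF Transformers itself, stopping can also be enforced during decoding via `StoppingCriteria` passed to `model.generate`.

```python
def truncate_at_stop_seqs(text: str, stop_seqs: list[str], include_stop: bool = False) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence.

    Hypothetical sketch: generation may overshoot a stop sequence by a few
    tokens (criteria are checked per decoding step), so the output is
    trimmed afterwards. `include_stop` keeps the matched sequence itself.
    """
    cut = len(text)
    for seq in stop_seqs:
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx + (len(seq) if include_stop else 0))
    return text[:cut]
```

For example, `truncate_at_stop_seqs("Hello\nWorld", ["\n"])` returns `"Hello"`, and text containing none of the stop sequences is returned unchanged.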

Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)

No

Other information:

I also discovered that when a chat template is provided to get_model with an HF model, it is not propagated to the tokenizer. I therefore had to patch the tokenizer separately when creating the test fixture model, to avoid the default template applied here:

for message in hf_messages:

If this is not the desired behavior, I can add a PR to fix that too. It would be useful for me to be able to provide custom chat templates to HF models.
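To illustrate why template propagation matters: the template controls how a list of role/content messages is rendered into the prompt string. HF tokenizers store a Jinja template on `tokenizer.chat_template` and render it with `apply_chat_template`; the stdlib-only stand-in below (a hypothetical helper, not the library's code) mimics that rendering for a trivial template.

```python
def render_chat(messages: list[dict], template: str = "{role}: {content}\n") -> str:
    """Render chat messages with a simple format-string template.

    Hypothetical stand-in for HF's apply_chat_template: each message is a
    dict with "role" and "content" keys, and the template decides how the
    final prompt string looks. A custom template therefore changes model
    inputs, which is why it must reach the tokenizer.
    """
    return "".join(template.format(**m) for m in messages)
```

With the real library, the fix being suggested would amount to assigning the user-supplied template to `tokenizer.chat_template` before `apply_chat_template` is called, so the custom template takes precedence over the default.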

@jjallaire
Collaborator

@MikhailTerekhov thanks for this! I noticed a MyPy error in CI. Could you take a look?

@jjallaire jjallaire merged commit 8863e1f into UKGovernmentBEIS:main Jan 25, 2025
9 checks passed
@MikhailTerekhov
Contributor Author

@jjallaire I'm not sure if the chat template behavior I mentioned in the PR is the desired one. Could you please let me know if a fix is needed there?
