enhance: STaR Integration #1514
Can we also add a method to validate the problem format?
Thanks a lot for this PR! I just took a quick look and here are a few comments:
What do you think?
Hey @ZIYU-DEEP, thanks for the review and happy Chinese New Year!
(i) This is an in-context method designed for test time, whereas STaR is a training-time method requiring reinforcement fine-tuning.
(ii) There is no rationalization process as in STaR (i.e., given problem x and true solution y, generate the reasoning trace z).
(iii) Few-shot examples with reasoning traces are missing; we can probably add an entry somewhere to allow users to include them.
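To make point (ii) concrete, here is a minimal sketch of what a rationalization step could look like. It is not the code in this PR: `star_step`, `build_rationalization_prompt`, and the `generate` / `extract_answer` callables are hypothetical names, and the prompt wording is an assumption. In STaR proper, the traces kept by a step like this would then be used as fine-tuning data.

```python
from typing import Callable, Optional


def build_rationalization_prompt(problem: str, solution: str) -> str:
    # Hypothetical prompt: the true solution y is given as a hint so the
    # model only has to produce the reasoning trace z that leads to it.
    return (
        f"Problem:\n{problem}\n\n"
        f"The correct final answer is: {solution}\n"
        "Explain step by step how to arrive at this answer."
    )


def star_step(
    problem: str,
    solution: str,
    generate: Callable[[str], str],        # assumed: prompt -> model output
    extract_answer: Callable[[str], str],  # assumed: model output -> final answer
) -> Optional[str]:
    """Return a reasoning trace whose final answer matches `solution`, or None."""
    # 1. Try to solve the problem without any hint.
    trace = generate(f"Problem:\n{problem}\n\nSolve step by step.")
    if extract_answer(trace) == solution:
        return trace

    # 2. Rationalization: retry with the true solution as a hint.
    trace = generate(build_rationalization_prompt(problem, solution))
    if extract_answer(trace) == solution:
        return trace

    # Neither attempt reached the known solution; discard this problem.
    return None
```

Passing `generate` and `extract_answer` in as callables keeps the sketch independent of any particular model backend.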
- Generating multiple solutions in parallel per problem per iteration
- Adding a README file in the examples folder to help users quickly understand the background and usage
- Adding fine-tuning support in the future
Thanks @GitHoobar, added validation.
Thanks for the addition.
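For readers following the validation request above, a problem-format check might look roughly like the sketch below. The actual schema used in this PR is not shown in the thread, so the required field names (`id`, `problem`, `solution`) and the function name `validate_problem` are assumptions for illustration only.

```python
from typing import Any, List

# Hypothetical schema: the real field names in the PR may differ.
REQUIRED_FIELDS = ("id", "problem", "solution")


def validate_problem(problem: Any) -> List[str]:
    """Return a list of human-readable format errors (empty if valid)."""
    if not isinstance(problem, dict):
        return ["problem must be a dict"]

    errors: List[str] = []
    for field in REQUIRED_FIELDS:
        if field not in problem:
            errors.append(f"missing field: {field}")
        elif not isinstance(problem[field], str) or not problem[field].strip():
            errors.append(f"field {field!r} must be a non-empty string")
    return errors
```

Returning a list of errors rather than raising on the first one lets a caller report every problem in a dataset in a single pass.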
Description
Enhancement based on review comment: #1478 (review)
Motivation and Context
Types of changes
What types of changes does your code introduce? Put an x in all the boxes that apply:
Implemented Tasks
Checklist
Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!