[Feature Request] preference datagen for DPO/ORPO training etc. #1403

Open
1 of 2 tasks
zjrwtx opened this issue Jan 6, 2025 · 2 comments · May be fixed by #1432

Comments

@zjrwtx
Collaborator

zjrwtx commented Jan 6, 2025

Required prerequisites

Motivation

A preference dataset is used for reward model training, DPO training, and ORPO training. For each system instruction and human input, it provides a better answer and a worse answer.
So I think preference data generation is really important.
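
To make the record shape concrete, here is a minimal sketch of one preference example and how it could be written out as JSON Lines. The class name `PreferencePair`, the helper `to_jsonl`, and the field names `system`, `prompt`, `chosen`, and `rejected` are illustrative assumptions, not an existing CAMEL API, and the sample answers are hard-coded rather than generated by a model.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PreferencePair:
    """One preference example. Field names are hypothetical, for illustration."""

    system: str    # system instruction
    prompt: str    # human input
    chosen: str    # the better answer
    rejected: str  # the worse answer


def to_jsonl(pairs):
    """Serialize preference pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


# Hard-coded sample data; a real pipeline would obtain both answers from models.
pairs = [
    PreferencePair(
        system="You are a helpful assistant.",
        prompt="What is 2 + 2?",
        chosen="2 + 2 equals 4.",
        rejected="2 + 2 equals 5.",
    )
]
jsonl = to_jsonl(pairs)
print(jsonl)
```

Each line holds one prompt with a preferred and a dispreferred answer, which is the pairing that reward model, DPO, and ORPO training rely on. The exact column names a given trainer expects may differ from the ones assumed here.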

Solution

Implement it in core and add a cookbook.

Alternatives

No response

Additional context

No response

@zjrwtx added the enhancement (New feature or request), Data (Related to camel data processing), and call for contribution labels on Jan 6, 2025
@Wendong-Fan added this to the Sprint 21 milestone on Jan 9, 2025
@Wendong-Fan
Member

lead: @zjrwtx; support & review: @mohamadkav, @AveryYay

@zjrwtx linked a pull request on Jan 10, 2025 that will close this issue
13 tasks
@zjrwtx
Collaborator Author

zjrwtx commented Jan 10, 2025

#1432


4 participants