Repaired run output not included in datasplit #101

Open
leonardmq opened this issue Jan 10, 2025 · 1 comment

Issue:
After repairing a run's output and clicking Accept Repair (5 stars), the repaired run is not included in fine-tune datasplits.

Expected:
Accepting a repair should make the run count as rated 5 stars, so the repaired output should be included in newly created fine-tune datasplits that filter for `High Rating (4+ stars)`.

Version: main at 1f4c281f207f208ea6d956c8e7c23ce6d7aab251


Steps to reproduce:

  1. Create a run
  2. Rate it 3 Stars on the Overall Rating
  3. Repair the output
  4. Accept the repaired output by clicking Accept Repair (5 stars)
  5. In Fine Tune:
    a. Create a fine-tune
    b. Pick Download: OpenAI chat format with tool calls (JSONL) (or any other)
    c. New dataset
    d. In Dataset Filter, select High Rating (4+ stars)
    e. In Dataset Splits, select Entire Dataset -- 100
    f. Create Dataset

The file does not include the repaired run.
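
One way to confirm the result of step 5 is to scan the downloaded JSONL for a phrase that appears only in the repaired output. The following is a minimal sketch, assuming each non-empty line of the file is a standalone JSON object; the helper names `iter_strings` and `jsonl_mentions` are illustrative and not part of the project.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def jsonl_mentions(path: Path, needle: str) -> bool:
    """Return True if any record in the JSONL file contains `needle`."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines between records
            record = json.loads(line)
            if any(needle in text for text in iter_strings(record)):
                return True
    return False
```

Picking a plain, distinctive phrase without quotes as the needle avoids mismatches where a value is itself a JSON-encoded string, as tool call arguments often are.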


The filtering logic for High Rating seems to be done here:

return task_run.output.rating.is_high_quality()

Adding `or task_run.repaired_output is not None` to the boolean check would fix only the filtering. Downstream code uses `output` rather than `repaired_output`, so logic such as creating datasplits would still use the original output instead of the one produced by the repair.
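
To make the two-part change concrete, here is a rough sketch using stand-in classes. Everything except the `output`, `rating`, `repaired_output`, and `is_high_quality()` names is hypothetical: the dataclasses, the 4-star threshold, and both helper functions are assumptions for illustration, not the project's actual data model or a proposed patch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rating:
    value: float

    def is_high_quality(self) -> bool:
        # Assumption: "High Rating (4+ stars)" means a value of 4 or more.
        return self.value >= 4


@dataclass
class TaskOutput:
    output: str
    rating: Optional[Rating] = None


@dataclass
class TaskRun:
    output: TaskOutput
    repaired_output: Optional[TaskOutput] = None


def passes_high_rating_filter(run: TaskRun) -> bool:
    """Treat an accepted repair as high quality, else fall back to the rating."""
    if run.repaired_output is not None:
        return True
    rating = run.output.rating
    return rating is not None and rating.is_high_quality()


def training_output(run: TaskRun) -> TaskOutput:
    """Pick the output downstream code should export: the repair if one exists."""
    if run.repaired_output is not None:
        return run.repaired_output
    return run.output
```

With these stand-ins, a run rated 3 stars that has a repair passes the filter and `training_output` returns the repaired text, while an unrepaired 3-star run is still excluded. Both halves are needed: changing only the filter would export the original, low-rated output.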


What are your plans for how repaired_output should be used?

If repaired_output is used for prompt generation but not included in fine-tuning data, renaming the Accept Repair (5 stars) button could reduce confusion: the 5 stars mention suggests that accepting the repair makes the run behave as if it were rated 5 stars.

leonardmq changed the title from "Repaired runs not included in dataset" to "Repaired runs not included in datasplit" on Jan 10, 2025
leonardmq changed the title from "Repaired runs not included in datasplit" to "Repaired run output not included in datasplit" on Jan 10, 2025
scosman commented Jan 10, 2025

Good catch! I'll fix this. Thanks @leonardmq !

scosman added the bug label on Jan 25, 2025
scosman self-assigned this on Jan 25, 2025