Smaller test/val loss but lower evaluation accuracy #750
When I finetune llama-7b on gsm-8k with different finetuning methods, I compared the test loss and evaluation accuracy of the methods and found that one of them has a smaller test/val loss but lower evaluation accuracy. Is this reasonable?

Comments

Hello! It may be related to the size of your dataset partition. If the test/val dataset is too small, the loss will be unstable.

Is the phenomenon discussed in your paper specific to the low-fidelity scenario, or does it hold in general FL?

What we observe in the paper is in a low-fidelity scenario, but for finetuning LLMs in general FL, it may be interesting to investigate the relationship between val/test loss and the final evaluation accuracy. I'm not sure there has been a study on this.
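One reason the two metrics can diverge is that they measure different things: test/val loss is the average per-token cross-entropy over the reference solutions, while GSM-8K accuracy is exact match on the generated final answer. Below is a minimal sketch of the two measurements, assuming a HuggingFace-style causal LM; `model_path` is a placeholder for your checkpoint, and the `#### <number>` extraction follows the GSM-8K answer convention.

```python
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/finetuned-llama-7b"  # placeholder, not a real checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).eval()

def reference_loss(question: str, reference: str) -> float:
    """Average per-token cross-entropy on the reference solution
    (what test/val loss typically measures)."""
    prompt_len = tokenizer(question, return_tensors="pt")["input_ids"].shape[1]
    enc = tokenizer(question + reference, return_tensors="pt")
    labels = enc["input_ids"].clone()
    labels[:, :prompt_len] = -100  # ignore prompt tokens in the loss
    with torch.no_grad():
        out = model(**enc, labels=labels)
    return out.loss.item()

def exact_match(question: str, gold_answer: str) -> bool:
    """Exact match on the generated final answer
    (what GSM-8K evaluation accuracy typically measures)."""
    enc = tokenizer(question, return_tensors="pt")
    with torch.no_grad():
        gen = model.generate(**enc, max_new_tokens=256)
    text = tokenizer.decode(gen[0], skip_special_tokens=True)
    preds = re.findall(r"####\s*(-?[\d,\.]+)", text)
    return bool(preds) and preds[-1].replace(",", "") == gold_answer
```

Since the loss averages over every token of the reference solution while accuracy scores only the final number, a model can fit the reference text slightly better yet still make more mistakes at the final answer, so the divergence you observe is plausible.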