Irene asked about the data split in Inform_FZBoost, which sets aside a separate validation sample used to determine the best bump_thresh and sharpen params. Right now I have things set to train on a fraction trainfrac of the data, and to use the remaining (1 - trainfrac) fraction to compute CDE loss values over a grid of bump_thresh and sharpen values. The model returned is the one trained on only the trainfrac fraction of the training data. Re-computing the model on the full dataset should give a better model, but would almost double the runtime of the inform stage. A compromise might be to add a config option, named something like rerun_full, that lets the user choose whether to recompute the model on the full dataset.
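To make the proposal concrete, here is a minimal sketch of how a rerun_full flag could sit in the control flow. It is not the actual Inform_FZBoost code: `inform`, `split_indices`, `fit`, and `tune` are hypothetical names, with `fit` standing in for the model training and `tune` standing in for the grid search over bump_thresh and sharpen. It also assumes the parameters tuned on the subset model are simply carried over to the refit model.

```python
import random


def split_indices(n, trainfrac, seed=0):
    """Shuffle indices 0..n-1 and split them into train and validation parts."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    ntrain = round(n * trainfrac)
    return idx[:ntrain], idx[ntrain:]


def inform(data, fit, tune, trainfrac=0.75, rerun_full=False, seed=0):
    """Hypothetical inform step (illustration only).

    fit(rows) -> model            placeholder for the model training
    tune(model, rows) -> params   placeholder for the bump_thresh / sharpen grid search
    """
    train_idx, val_idx = split_indices(len(data), trainfrac, seed)
    train = [data[i] for i in train_idx]
    val = [data[i] for i in val_idx]

    # First fit: only the trainfrac fraction, so the validation rows stay unseen.
    model = fit(train)
    # Pick the best parameters using the held-out (1 - trainfrac) fraction.
    params = tune(model, val)

    if rerun_full:
        # Optional second fit on everything; this is the step that roughly
        # doubles the runtime. The tuned params are reused unchanged (assumption).
        model = fit(data)
    return model, params


# Toy usage: the "model" is just the number of rows it was trained on.
toy_fit = lambda rows: len(rows)
toy_tune = lambda model, rows: {"n_val": len(rows)}
subset_model, _ = inform(list(range(100)), toy_fit, toy_tune)
full_model, _ = inform(list(range(100)), toy_fit, toy_tune, rerun_full=True)
```

With rerun_full left at False the behaviour matches what is described above, so existing runs would be unaffected and only users who opt in pay for the second fit.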