Model returned by inform stage only uses fraction of training data in final model #28

sschmidt23 · 2023-05-22T18:00:49Z

Irene asked about the data split in Inform_FZBoost, which sets aside a separate validation sample for determining the best bump_thresh and sharpen params. Right now the stage trains on a fraction trainfrac of the data and uses the remaining (1-trainfrac) fraction to compute CDE loss values over a grid of bump_thresh and sharpen values. The model returned is the one trained on only the trainfrac fraction of the training data. Re-computing the model with the full dataset would result in a better model, but would almost double the runtime of the inform stage. A compromise might be to add a config option, named something like rerun_full, that lets the user choose to recompute the model on the full dataset.
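To make the proposal concrete, here is a minimal sketch of the split / tune / optional-refit flow described above. It is not the actual Inform_FZBoost code: the `inform` function, the `toy_fit` and `toy_loss` stand-ins, the default `trainfrac=0.75`, and the grid values are all assumptions made for illustration. Only the names trainfrac, bump_thresh, sharpen, and rerun_full come from this issue.

```python
import itertools
import numpy as np


def inform(x, y, fit, val_loss, bump_grid, sharpen_grid,
           trainfrac=0.75, rerun_full=False, seed=0):
    """Hypothetical sketch: fit on a subset, tune on the rest, optionally refit."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    n_train = int(round(trainfrac * len(x)))
    train_idx, val_idx = perm[:n_train], perm[n_train:]

    # Fit once on the trainfrac subset.
    model = fit(x[train_idx], y[train_idx])

    # Grid-search the post-processing params on the held-out (1 - trainfrac) data.
    best = min(
        itertools.product(bump_grid, sharpen_grid),
        key=lambda p: val_loss(model, x[val_idx], y[val_idx], *p),
    )

    # Optional second fit on all the data; this is the extra runtime cost.
    if rerun_full:
        model = fit(x, y)
    return model, best, n_train


# Toy stand-ins so the sketch runs; they do not imitate the real estimator.
def toy_fit(x, y):
    return {"coef": np.polyfit(x, y, 1), "n_fit": len(x)}


def toy_loss(model, x, y, bump_thresh, sharpen):
    pred = np.polyval(model["coef"], x) * sharpen
    pred = np.where(np.abs(pred) < bump_thresh, 0.0, pred)
    return float(np.mean((pred - y) ** 2))


rng = np.random.default_rng(1)
x = rng.uniform(0.0, 3.0, 200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, 200)
grids = dict(bump_grid=[0.0, 0.5], sharpen_grid=[0.8, 1.0, 1.2])

partial, best_p, n_train = inform(x, y, toy_fit, toy_loss, **grids)
full, best_f, _ = inform(x, y, toy_fit, toy_loss, rerun_full=True, **grids)
print(partial["n_fit"], full["n_fit"], best_p == best_f)
```

In this sketch the tuned (bump_thresh, sharpen) pair is reused unchanged for the refit model, so rerun_full=True costs one extra fit rather than a second grid search. Whether the tuned values transfer well to a model trained on the full dataset is an open question for the real stage.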


sschmidt23 self-assigned this May 22, 2023

sschmidt23 added the enhancement New feature or request label May 22, 2023
