bql_utils.analyze should check for existing models before creating new ones. #142

Open
alxempirical opened this issue May 13, 2016 · 4 comments

@alxempirical
Contributor

I think a better default here would be models=0. Model creation should be a separate step. Otherwise, it's too easy for someone to run analyze twice and end up with models that have been trained for substantially different numbers of iterations, which is likely to lead to bad results.
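To make the proposal concrete, here is a minimal sketch of the behavior I have in mind. ToyPopulation, initialize_models, and model_iterations are hypothetical names made up for illustration, not the actual bql_utils or Population API; the toy only tracks how many iterations each model has seen.

```python
class ToyPopulation:
    """Hypothetical stand-in for a population; NOT the real library API."""

    def __init__(self):
        # One entry per model: how many iterations it has been trained for.
        self.model_iterations = []

    def initialize_models(self, count):
        """Create `count` fresh, untrained models as an explicit, separate step."""
        self.model_iterations.extend([0] * count)

    def analyze(self, iterations, models=0):
        """Train every existing model for `iterations` more steps.

        With the proposed default models=0, no new models are created here.
        """
        if models:
            self.initialize_models(models)
        if not self.model_iterations:
            raise ValueError("no models exist; call initialize_models() first")
        self.model_iterations = [n + iterations for n in self.model_iterations]


p = ToyPopulation()
p.initialize_models(4)
p.analyze(iterations=1000)
p.analyze(iterations=10)
print(p.model_iterations)  # [1010, 1010, 1010, 1010]
```

Under these assumptions, running analyze twice leaves every model with the same iteration count, because only the explicit initialization step can add models.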

@gregory-marton
Contributor

I'm not sure it's true that models with substantially different numbers of iterations lead to bad results. In fact, it has been suggested that it's not at all unreasonable to intentionally increase the number of models stepwise with iterations, so that there are always a few exploring more or less afresh.

That said, this whole interface is likely to change or go away very soon in favor of a more explicit MML interaction, after http://tinyurl.com/probcomp-bql-mml-split
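For what it's worth, the stepwise idea mentioned above can be sketched as a schedule of iteration counts. This is only an illustration: staggered_schedule and its parameters are hypothetical names, and nothing here calls the real analysis code.

```python
def staggered_schedule(rounds, new_models_per_round, iterations_per_round):
    """Return per-model iteration counts after a stepwise schedule.

    Each round adds a few fresh models, then trains all models (old and new)
    for the same number of additional iterations.
    """
    counts = []
    for _ in range(rounds):
        counts.extend([0] * new_models_per_round)             # fresh models
        counts = [n + iterations_per_round for n in counts]   # train everyone
    return counts


print(staggered_schedule(rounds=3, new_models_per_round=2, iterations_per_round=100))
# [300, 300, 200, 200, 100, 100]
```

The spread in iteration counts here is deliberate and bounded by the schedule, which is different from an accidental mix produced by repeated calls.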

@alxempirical
Contributor Author

alxempirical commented May 13, 2016

The risk is that someone calls p.analyze(iterations=1000), then calls p.analyze(iterations=10) or something, and then all subsequent results are silently polluted by half the models being badly under-trained.
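Here is a toy reproduction of that scenario. analyze_creating_models is a hypothetical function that mimics an analyze call which always adds new models; the default of 4 models per call and the 10% threshold are assumptions chosen for illustration, not values taken from the library.

```python
def analyze_creating_models(model_iterations, iterations, models=4):
    """Toy analyze: always adds `models` new models, then trains all of them.

    `model_iterations` holds one iteration count per existing model.
    The default of 4 new models per call is an assumption for illustration.
    """
    updated = list(model_iterations) + [0] * models
    return [n + iterations for n in updated]


counts = analyze_creating_models([], iterations=1000)
counts = analyze_creating_models(counts, iterations=10)
print(counts)  # [1010, 1010, 1010, 1010, 10, 10, 10, 10]

# Flag models trained for less than 10% of the best-trained model (arbitrary cutoff).
undertrained = [n for n in counts if n < 0.1 * max(counts)]
print(len(undertrained), "of", len(counts), "models are badly under-trained")
```

In this sketch the second call silently doubles the number of models, and half of them have seen only 10 iterations.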

Using models with different numbers of training iterations might be OK, if you have assessed the convergence rate and have a rough idea how much training new models will need.

Thanks for the pointer to the new doc. Do you mean the Population interface in general (in which case I should stop dog-fooding it [not that it's dog food]), or just this analyze interface?

@gregory-marton
Contributor

I believe we will continue to have some version of the Population interface because it's handy for plotting and other utilities, but I know for a fact that .analyze would be better written explicitly in MML, and perhaps there should be a .quick_analyze if you don't want to think too hard about it (which is perhaps what this should have been called in the first place).

@gregory-marton
Contributor

And indeed the creation and initialization of the GPMs for the population via any metamodels would also be part of what is better done via MML rather than in .initialize, though again, there might usefully be a version of that in .quick_analyze.

2 participants