NOT NULL constraint failed: bayesdb_crosscat_diagnostics.logscore #284
Comments
This is a bug in Crosscat: somehow analyze is returning |
Oh boy just got it again, even after nullifying and ignoring columns of all |
I don't expect any particular One way to try to debug it would be to trap the invalid-operation exception in Crosscat, if code inspection doesn't find it. Can you show your table's M_c? ( It might also be helpful to apply this patch to bayeslite: |
@fsaad, can you take next steps, or provide more info to independently reproduce? |
One output from the suggested assertion: |
probcomp/crosscat@f91197c |
It sounds like there is a bug in Crosscat causing NaN to come out. If we mask the bug in bayeslite, that will make it harder to find the bug in Crosscat. Did you trip over the assert you added in Crosscat, or only the one in bayeslite? |
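To illustrate the point about not masking the bug, here is a minimal Python 3 sketch of how a NaN logscore turns into the error in the issue title, and of a guard that fails loudly instead of hiding it. SQLite stores a bound NaN as NULL, so a `NOT NULL` column rejects it. The one-table `diagnostics` schema and the helper names `require_number_logscore` and `insert_logscore` are hypothetical stand-ins, not bayeslite's actual schema or code.

```python
import math
import sqlite3


def require_number_logscore(logscore, context=""):
    """Raise early, with context, if a logscore is missing or NaN."""
    if logscore is None or math.isnan(logscore):
        raise ValueError("invalid logscore %r %s" % (logscore, context))
    return logscore


def insert_logscore(conn, modelno, logscore):
    # Hypothetical simplified stand-in for bayesdb_crosscat_diagnostics.
    conn.execute(
        "INSERT INTO diagnostics (modelno, logscore) VALUES (?, ?)",
        (modelno, logscore),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE diagnostics (modelno INTEGER, logscore REAL NOT NULL)"
)

# SQLite converts a bound NaN to NULL, so the NOT NULL constraint fires.
try:
    insert_logscore(conn, 0, float("nan"))
    nan_rejected = False
except sqlite3.IntegrityError as exc:
    nan_rejected = True
    print(exc)

# The guard reports the bad value (and which model produced it) before
# it ever reaches the database, without replacing it by a fake score.
try:
    insert_logscore(conn, 1, require_number_logscore(float("nan"), "(model 1)"))
except ValueError as exc:
    print(exc)
```

A guard like this only moves the failure closer to its source; the NaN itself still has to be tracked down in Crosscat.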
updating to the latest bayeslite / bdbcontrib, ahead of the releases
Gregory Marton |
I just tripped this bug in bayeslite 0.1.8. It sounds like that's unexpected. Do you want me to dump any information? |
Fixing this requires finding where Crosscat is getting NaN in the logscore calculation. Last time I tried to track down sources of this symptom I liberally sprinkled Here is an example of a bug that could cause this symptom (a bug which bayeslite's logic to guess statistical types tries to avoid in some cases, so it may not be the bug you're encountering): probcomp/crosscat#85 |
Got it. I was just confused by Grem's comment that "...it's fixed ahead of 0.1.5". Is there a straightforward workaround in the meantime? I could manually ignore the column if it's easy to figure out which column it's choking on, but I don't get much in the way of debug output. Is that included in the output of that patch? |
Probably Grem was referring to the bayeslite statistical type guessing heuristic to avoid modelling constant columns. Unfortunately, at the time the NaN comes flying out of Crosscat, it is not associated with any particular column. We don't know all the conditions of column values (if they are isolated to individual columns!) under which Crosscat will compute NaN for a logscore. We know at least one condition: a constant column. There are more conditions which have not been fully analyzed, such as #388, which has a lead you might follow about normal sufficient statistics and update_continuous_hypers. |
Of course, if you can discern that you have a constant column and then skip modelling it, that would be the easiest way to work around this! But if you don't obviously have a constant column then I'm afraid it will take some more work. |
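One way to check for the known trigger is to look for columns that hold at most one distinct non-NULL value (constant columns, and columns that are entirely NULL). The following is a rough Python 3 sketch using only the standard `sqlite3` module against an ordinary SQLite table. The function name `degenerate_columns` and the example table `t` are made up for illustration. It does not detect the other, not yet analyzed conditions mentioned above.

```python
import sqlite3


def degenerate_columns(conn, table):
    """Return names of columns with at most one distinct non-NULL value."""
    quoted_table = '"' + table.replace('"', '""') + '"'
    # Column 1 of each PRAGMA table_info row is the column name.
    names = [
        row[1]
        for row in conn.execute("PRAGMA table_info(%s)" % quoted_table)
    ]
    found = []
    for name in names:
        quoted = '"' + name.replace('"', '""') + '"'
        # COUNT(DISTINCT x) ignores NULLs, so an all-NULL column yields 0.
        (n,) = conn.execute(
            "SELECT COUNT(DISTINCT %s) FROM %s" % (quoted, quoted_table)
        ).fetchone()
        if n <= 1:
            found.append(name)
    return found


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, height REAL, unit TEXT, notes TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 1.70, "m", None), (2, 1.82, "m", None), (3, 1.65, "m", None)],
)
print(degenerate_columns(conn, "t"))  # ['unit', 'notes']
```

Any columns reported this way are candidates to ignore before running analysis; if the list is empty, the NaN is presumably coming from one of the other conditions.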
@asilversempirical, can you get a backtrace if you recompile crosscat with FP exceptions turned on in State.cpp? |
Performance got so bad after enabling FP exceptions that it couldn't initialize the models. I never got to analyze to try to trigger the issue. |
Thanks for trying it. I'll try it on some of the smaller test cases I ran into on the other thread when I get some time. |
Suspicion: this happened because I have some columns of all None, but we should be able to do something about that case.