Memory Problems #10
Comments
A little more info: I've managed to narrow down the issue. It seems to throw the error every time (different, but related datasets) after finishing the "x^2 Probability Plots for [var]" at the end of the variable list. I don't know what comes next, but every time this is the plot that is left showing, and always for the last variable in Stata's variable list.
How many categorical and continuous variables are you using? It is likely a problem with the number of permutations of the variables. The bubble plots will consume the most memory, but with a sufficient number of continuous variables there could be a massive number of possible pairs that can be created.
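To give a rough sense of how the pair count grows, here is a small sketch. Python is used only for the arithmetic, the variable counts are illustrative, and it assumes each unordered pair of variables produces one plot, which may not match exactly what the package does.

```python
from math import comb


def pair_count(n_vars: int) -> int:
    """Number of unordered pairs among n_vars variables: n * (n - 1) / 2."""
    return comb(n_vars, 2)


# Illustrative variable counts (assumptions, not taken from a specific dataset)
for n in (5, 15, 30):
    print(f"{n} variables -> {pair_count(n)} pairs")
```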
Running with the nobubble and noladder options hangs at the same error. The actual dataset has approximately 15 continuous variables and 30 categorical. But if I cut down to a smaller training set of 10 categorical and 3 categorical, and have nobubble, it gets to the correlation heat map, and then gives the following:
I ran 'query memory' just in case, but matsize is already set to 400. I've noticed it's not creating a batch file at all; not sure if that's related.
Trying it with the sysuse auto.dta - It manages to get through to the heat map, then fails with the same:
As the package begins to run, it gives this note:
The "autos.tex" file is created and is there in the correct root folder. So it seems there's still something going on in the create PDF at the end of the script? This is on a fresh install of all .ado files.
The first error message seems like something related to the Stata configuration for matrix sizes. I can’t remember off the top of my head, but you should be able to do something like
To address the first issue above. Are you able to share the data you are working with and/or simulate a dataset with roughly the same properties?
To the first issue, my Stata (IC, I think) has a matsize upper limit of 800, but I've moved that to the max. I can share the data. Even when I'm running the auto sysuse data, I'm getting that *.tex not found issue.
The note isn’t an actual problem. It is basically just saying that the replace option under the hood isn’t being used. If you were to run things a second time it shouldn’t display the same message since it would be replacing that file.
Ah, I see. So any idea why it's not compiling at the end? I have pdflatex on the system path, and I can get smaller training sets to progress through the package to the edaheat portion - but then it stalls out. The matsize issue is Stata specific, and I can see how with a larger set of variables it could be a problem. But it shouldn't be impacting the compiling of the report, right?
Not sure to be honest. I’ll see if I can replicate from a Windows machine later today or over the weekend. It is especially weird if that is happening only when you use some datasets and not others.
@iadams78
I'm going to create some additional issues to look into this a bit more, but figured this might be helpful/useful for explaining why the compilation thing might look like it isn't working (I thought the same and then realized that the files were all being created in my home directory).
Hello again -
When I am running eda, I get the following error:
J(): 3900 unable to allocate real [29,536870912]
tuples_get_indicators(): - function returned error
tuples(): - function returned error
: - function returned error
r(3900);
Has this come up before? I am using a relatively modern (< 1 year old) Windows laptop with 16 GB RAM, or is this not a memory error?
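For scale, the allocation named in the error message can be worked out directly. This is a rough sketch, assuming the bracketed figures are rows and columns of a Mata real matrix and that each real is stored as an 8-byte double. Reading 536870912 as 2^29, which would be consistent with enumerating every combination of 29 variables, is an inference from the numbers, not something confirmed from the tuples source.

```python
# Dimensions copied from the error: "unable to allocate real [29,536870912]"
rows, cols = 29, 536870912

# Assumption: one Mata real = one 8-byte double
BYTES_PER_REAL = 8

total_bytes = rows * cols * BYTES_PER_REAL
total_gib = total_bytes / 2**30

print(f"columns equal 2**29: {cols == 2**29}")
print(f"requested: {total_bytes:,} bytes (about {total_gib:.0f} GiB)")
```

Under those assumptions the request is far larger than 16 GB of RAM, so the failure would be a genuine out-of-memory condition and not something a matsize change would fix.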