
Memory Problems #10

Open · ian-adams opened this issue Oct 15, 2018 · 10 comments

Comments
@ian-adams

Hello again -

When I am running eda, I get the following error:

J(): 3900 unable to allocate real [29,536870912]
tuples_get_indicators(): - function returned error
tuples(): - function returned error
: - function returned error
r(3900);

Has this come up before? I am using a relatively modern (less than a year old) Windows laptop with 16 GB of RAM. Or is this not actually a memory error?
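
For scale, assuming each real that Mata allocates is an 8-byte double (an assumption, but the standard storage type), the failed allocation works out to roughly

29 * 536,870,912 * 8 bytes ≈ 125 GB

which is far beyond 16 GB of RAM, so the message does look like a genuine out-of-memory condition rather than a hardware or Windows problem.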

@ian-adams
Author

A little more info:

I've managed to narrow down the issue: it throws the error every time (across different but related datasets) right after finishing the "x^2 Probability Plots for [var]" for the last variable in Stata's variable list. I don't know what step comes next, but that is always the plot left showing when it fails.

@wbuchanan
Owner

How many categorical and continuous variables are you using? It is likely a problem with the number of permutations of the variables. The bubble plots will consume the most memory, but with a sufficient number of continuous variables there could be a massive number of possible pairs that can be created.
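
For a rough sense of how quickly the number of pairs grows, Stata's comb() function counts the distinct pairs directly; this is only a back-of-envelope illustration, not the package's actual bookkeeping.

* illustration only: the number of distinct pairs among k variables is comb(k, 2)
display comb(15, 2)    // 105 pairs from 15 continuous variables
display comb(45, 2)    // 990 pairs if 45 variables were all paired with each other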

@ian-adams
Author

ian-adams commented Oct 15, 2018

Running with the nobubble and noladder options hangs at the same error.

The actual dataset has approximately 15 continuous variables and 30 categorical. But if I cut down to a smaller training set of 10 categorical and 3 continuous variables, with nobubble, it gets to the correlation heat map and then gives the following:

(0 observations deleted)
(file C:/Users/adams/Google Drive/STATA Data/Officer Wellness/SLC_eda/1/graphs/edaheatmap.pdf written in PDF format)
matsize must be between 10 and 800
r(198);

I ran query memory just in case, but matsize is already set to 400. I've also noticed it isn't creating a batch file at all; not sure if that's related.

@ian-adams
Author

Trying it with sysuse auto.dta: it manages to get through to the heat map, then fails with the same error:

(0 observations deleted)
(file C:/Users/adams/Google Drive/STATA Data/Officer Wellness/eda testing/1/graphs/edaheatmap.pdf
written in PDF format)
matsize must be between 10 and 800
r(198);
end of do-file

As the package begins to run, it gives this note:

note: file C:/Users/adams/Google Drive/STATA Data/Officer Wellness/eda testing/1/autos.tex not found

The "autos.tex" file is created and is there in the correct root folder.

So it seems there's still something going on in the create PDF at the end of the script? This is on a fresh install of all .ado files.

@wbuchanan
Owner

The first error message seems like something related to the Stata configuration for matrix sizes. I can’t remember off the top of my head, but you should be able to do something like

set matsize 11000

To address the first issue above. Are you able to share the data you are working with and/or simulate a dataset with roughly the same properties?
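
(For anyone hitting the same matsize error: in Stata 15 and earlier you can check the current value and the ceiling your flavor allows before raising it; a quick sketch, assuming the usual c() system values are available.)

display c(matsize)       // current setting (400 in the log above)
display c(max_matsize)   // upper bound for this flavor (800 in Stata/IC)
set matsize 800          // raise to the flavor's maximum if needed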

@ian-adams
Author

ian-adams commented Oct 16, 2018

On the first issue: my Stata (IC, I think) has a matsize upper limit of 800, and I've already raised it to that maximum.

I can share the data.

Even when I'm running the auto sysuse data, I'm getting that *.tex not found issue.

@wbuchanan
Owner

The note isn’t an actual problem. It is basically just saying that the replace option under the hood isn’t being used. If you were to run things a second time it shouldn’t display the same message since it would be replacing that file.

@ian-adams
Author

Ah, I see.

So any idea why it's not compiling at the end? I have pdflatex on the system path, and smaller training sets progress through the package to the edaheat portion, but then it stalls out. The matsize issue is Stata-specific, and I can see how a larger set of variables could trigger it, but it shouldn't affect compiling the report, right?
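
One way to isolate whether the stall is on the Stata side or the pdflatex side is to compile the generated .tex by hand from Stata's shell; the path here is the one from the log above, and this is only a diagnostic sketch, not part of the package.

cd "C:/Users/adams/Google Drive/STATA Data/Officer Wellness/eda testing/1"
shell pdflatex autos.tex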

@wbuchanan
Owner

Not sure to be honest. I’ll see if I can replicate from a Windows machine later today or over the weekend. It is especially weird if that is happening only when you use some datasets and not others.

@wbuchanan
Owner

@iadams78
As an FYI, I recently ran into the memory issue that you hit when doing some work for a colleague. On the compilation issue, there are a couple different things that I started to notice.

  1. On Windows machines the command is adding a . character between makeLaTeX and .bat, so the file name ends up looking like makeLaTeX..bat (see the sketch after this list).
  2. Stata seems to have some issues resolving networked hard drive locations mapped as a local drive (e.g., \\Some-Network-Location\Directory\SubDirectory mapped to R:).
  3. On OSX the bash script now runs in the user's home directory instead of the directory it is supposed to run in.

I'm going to create some additional issues to look into this a bit more, but figured this might be helpful for explaining why the compilation might look like it isn't working (I thought the same and then realized that the files were all being created in my home directory).
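
On item 1, a guard along these lines could strip a stray trailing period before the extension is appended; this is only a sketch with hypothetical local names, not the package's actual code.

* hypothetical sketch: drop a trailing "." before appending the extension
local stub "makeLaTeX."
if substr("`stub'", -1, 1) == "." {
    local stub = substr("`stub'", 1, strlen("`stub'") - 1)
}
local batch "`stub'.bat"
display "`batch'"    // prints makeLaTeX.bat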
