load_RUIAN_state download error #22

Open
JanPintera opened this issue Jun 29, 2021 · 5 comments

@JanPintera

Running load_RUIAN_state("orp") gives:

Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :
  cannot open compressed file '/var/folders/jv/49b_7nv12xd7_mts18ps53ym0000gn/T//Rtmpq9O8mh/../.CzechDataPackageCache/175e846e7cb18757', probable reason 'No such file or directory'

The path doesn't seem to exist on my machine.

@JanCaha
Owner

JanCaha commented Jul 3, 2021

This seems odd: the data cache points to a very strange directory, hence the strange path.

Could you check what these two snippets return?

print(tempdir())
path <- file.path(tempdir(), "..", ".CzechDataPackageCache")
path <- normalizePath(path)
print(path)

@petrbouchal
Contributor

Just to add: I am seeing the same thing every now and then, and I always forget how to recover from it other than by reinstalling the package. The path is very similar on my machine (mine is macOS, and I suspect @JanPintera's is too).

Snippet output:

print(tempdir())
#> [1] "/var/folders/c8/pj33jytj233g8vr0tw4b2h7m0000gn/T//RtmpVjsY1h"
path <- file.path(tempdir(), "..", ".CzechDataPackageCache")
path <- normalizePath(path)
print(path)
#> [1] "/private/var/folders/c8/pj33jytj233g8vr0tw4b2h7m0000gn/T/.CzechDataPackageCache"

This directory exists even after I have run load_RUIAN_state() and got the above error. The problem seems to be that the memoisation logic is not picking up this path. The memoisation temp dir path in the error message (which, from a quick look in the debugger, appears to be produced by m_GET()) contains "..", which suggests that something is going wrong with the path normalisation.

On a related note, I notice the memoisation sometimes has no effect across sessions, i.e. I still see the "Downloading roughly NNN MB..." message.
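
To illustrate the suspicion about path normalisation, here is a minimal base R sketch. It is not the package's code: the directory names are made up, and it only shows that a path containing ".." and its normalised form are different strings even though they refer to the same directory, so anything keyed on the raw string might not find the normalised one.

```r
# Hypothetical illustration only; ".ExampleCache" and "session" are made-up names.
base_dir <- file.path(tempdir(), "session")
dir.create(base_dir, showWarnings = FALSE, recursive = TRUE)

# The same directory spelled two ways.
cache_raw <- file.path(base_dir, "..", ".ExampleCache")   # still contains ".."
dir.create(cache_raw, showWarnings = FALSE, recursive = TRUE)
cache_norm <- normalizePath(cache_raw)                    # ".." (and symlinks) resolved

print(cache_raw)
print(cache_norm)

# Both spellings resolve to the same directory on disk ...
same_dir <- normalizePath(cache_raw) == normalizePath(cache_norm)
# ... but the strings themselves differ.
same_string <- identical(cache_raw, cache_norm)
```

On macOS, normalizePath() additionally resolves the /var symlink to /private/var, which matches the difference visible in the snippet output above.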

@JanCaha
Owner

JanCaha commented Dec 10, 2021

It is the "every now and then" part that makes it really confusing. It also seems to be specific to macOS, as I have never seen this on Windows or Linux.

I tried to fix a couple of things. The paths are now handled using the fs package, which might help. Old files should now be cleared from the cache on every package attach (this is unlikely to have an effect, but it seems like a reasonable thing to do). I also realized that there were actually two layers of caching: one that worked within the current session and one that worked across sessions. I got rid of the session-only layer, as the caching should primarily work across sessions. The cross-session caching should also cover caching within a single session.

This is only implemented in the load_cadastral_territory() function for now. The function will tell you whether it is downloading data or using the cached version. Would you mind trying it out a bit to see if we can figure out what is going on?
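
For readers following along, the single-layer idea can be sketched roughly like this in base R. This is not the actual implementation: cached_fetch(), its arguments and the stand-in download function are all hypothetical, and the example writes into tempdir() only to stay self-contained, whereas a real cross-session cache needs a directory that outlives the session.

```r
# Hypothetical helper, not the package's actual code.
# `fetch` is any function that produces the data (e.g. downloads and parses it).
cached_fetch <- function(key, fetch, cache_dir) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  cache_file <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(cache_file)) {
    message("Using cached version of '", key, "'.")
    return(readRDS(cache_file))
  }
  message("Downloading '", key, "'.")
  value <- fetch()
  saveRDS(value, cache_file)   # a file on disk also serves repeated calls in one session
  value
}

# Usage with a stand-in for the real download.
cache_dir <- file.path(tempdir(), "example-cache")
calls <- 0
fake_download <- function() {
  calls <<- calls + 1          # count how often the "download" really runs
  data.frame(id = 1:3)
}
first  <- cached_fetch("orp", fake_download, cache_dir)
second <- cached_fetch("orp", fake_download, cache_dir)
```

The second call reads the file written by the first, which is why one disk-backed layer can cover both the in-session and the cross-session case.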

@petrbouchal
Contributor

Many thanks! Sorry I wasn't able to provide a reproducible example. The only other potentially relevant factor is that @JanPintera and I encountered the error in {renv}-managed projects. This means the package lives in a system-wide cache and is symlinked into a project-level library. I am not sure whether that could affect the caching by {memoise}.

I will keep testing this in different scenarios and will report back. At a quick try, it seems to work fine (and it clearly does always use the cached data across sessions, which is great).

@JanCaha
Owner

JanCaha commented Dec 10, 2021

That probably should not have an effect. The cached data is stored in the temp directory, which should be the same with or without renv (hopefully).

You will see if you try to load the data again later from another R session. Within 7 days (the default cache lifetime), the data should still be read from the cache.
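
As a rough sketch of what a 7-day lifetime check could look like (an assumption for illustration, not the package's code; cache_is_fresh() and its arguments are hypothetical), a cached file's age can be compared against a limit using its modification time:

```r
# Hypothetical helper: TRUE if `path` exists and is younger than `max_age_days`.
cache_is_fresh <- function(path, max_age_days = 7, now = Sys.time()) {
  if (!file.exists(path)) {
    return(FALSE)
  }
  age_days <- as.numeric(difftime(now, file.mtime(path), units = "days"))
  age_days < max_age_days
}

# Usage: a file written just now is fresh; pretend 8 days have passed and it is not.
cache_file <- tempfile(fileext = ".rds")
saveRDS(1:3, cache_file)
fresh_now   <- cache_is_fresh(cache_file)
fresh_later <- cache_is_fresh(cache_file, now = Sys.time() + 8 * 24 * 60 * 60)
```

The `now` argument exists only so the expiry can be tried out without waiting a week.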
