We had a very good discussion with Uwe. Uwe and I are going to collaborate on implementing this feature. I will keep the issue updated. @bnaras can you put the notes you took during the meeting in a comment here?
installed.packages() takes a long time to execute the first time in a session when a large number of packages are installed in a library. (The subsequent invocations are fast because of caching.)
Impact
The issue is acute in settings where the library is shared via network-mounted drives, as is not uncommon in educational labs, for example. On Windows installations, even with fewer than 100 packages, the function takes 2 seconds or more on a reasonably powerful machine, as Uwe demonstrated. This is also a problem for RStudio users: on startup, RStudio tries to determine all installed packages, which makes it unusable with a networked shared library.
Core Issue
The time it takes for installed.packages() is dominated by the time to read every DESCRIPTION file in all the installed packages.
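A rough way to see this cost on a given machine is to time a loop that reads each DESCRIPTION file in a library. The sketch below is only an approximation of that work, not how installed.packages() is actually implemented; `read_descriptions` is a hypothetical helper name introduced for illustration.

```r
# Hypothetical helper (not part of R): read selected fields from every
# DESCRIPTION file found directly under a library directory.
read_descriptions <- function(lib, fields = c("Package", "Version")) {
  desc_files <- file.path(list.dirs(lib, recursive = FALSE), "DESCRIPTION")
  desc_files <- desc_files[file.exists(desc_files)]
  # One file read per installed package: this is the part that gets slow
  # on network-mounted libraries.
  rows <- lapply(desc_files, function(f) read.dcf(f, fields = fields)[1L, ])
  do.call(rbind, rows)
}

# Time the per-package reads for the default system library.
timing <- system.time(desc <- read_descriptions(.Library))
print(timing)
```

Running the same timing against a library on a network drive versus a local disk should give a feel for how much of the delay comes from file access alone.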
Proposal
Maintain an up-to-date database (we use the term loosely, for now) of installed packages so that the information is readily available for installed.packages() to exploit.
Desiderata are:
Ensuring integrity
Caching and synchronization/invalidation/rebuilding as needed
Ensuring it works with parallel installation processes (argument Ncpus > 1). The parallel installation process already calculates dependencies and installs the most important dependencies first.
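One possible way to approach the invalidation requirement is to store a cheap fingerprint of the library alongside the database and rebuild when it no longer matches. The sketch below is an assumption about how that could look, not an agreed design; `library_fingerprint` and `is_stale` are hypothetical names. It still stats every DESCRIPTION file, so it is cheaper than reading them but not free on a network drive, and it does not address locking between parallel installation processes.

```r
# Hypothetical helper: summarise the current state of a library as the set of
# package directories plus the modification time of each DESCRIPTION file.
library_fingerprint <- function(lib) {
  pkgs <- sort(basename(list.dirs(lib, recursive = FALSE)))
  desc <- file.path(lib, pkgs, "DESCRIPTION")
  keep <- file.exists(desc)
  data.frame(
    Package = pkgs[keep],
    mtime = as.numeric(file.mtime(desc[keep])),
    stringsAsFactors = FALSE
  )
}

# The stored database is treated as stale when the fingerprint recorded at
# build time differs from the library's current fingerprint.
is_stale <- function(stored_fingerprint, lib) {
  !identical(stored_fingerprint, library_fingerprint(lib))
}
```

A package being added, removed, or reinstalled changes the fingerprint, so `is_stale()` would return `TRUE` and trigger a rebuild.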
Initial Approach
Figure out a mechanism to build up a serialized R object such as PACKAGES.RDS that reflects what's actually installed.
Allow an environment variable to be set that keeps the object up to date by default, i.e. rebuilds the database whenever packages are installed or uninstalled. This will probably be used mostly by system administrators to keep things up to date automatically, but users may enable it too if they wish.
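The two points above could be prototyped roughly as follows. This is a minimal sketch under stated assumptions: the environment variable name `R_INSTALLED_DB_AUTOUPDATE`, the location of the file inside the library, and the helpers `db_path`, `auto_update_enabled`, `rebuild_db` and `read_db` are all hypothetical and do not exist in R. Only the file name PACKAGES.RDS comes from the proposal, and the hooks into package installation and removal are not shown.

```r
# Assumed location of the serialized database: one file per library.
db_path <- function(lib) file.path(lib, "PACKAGES.RDS")

# Hypothetical opt-in switch read from an environment variable.
auto_update_enabled <- function() {
  value <- tolower(Sys.getenv("R_INSTALLED_DB_AUTOUPDATE", "false"))
  value %in% c("true", "1", "yes")
}

# Scan the library once and store the result as a serialized R object.
rebuild_db <- function(lib, path = db_path(lib)) {
  db <- installed.packages(lib.loc = lib, noCache = TRUE)
  # Write to a temporary file first and rename it afterwards, so that a
  # concurrent reader is less likely to see a half-written file.
  tmp <- tempfile(tmpdir = dirname(path), fileext = ".rds")
  saveRDS(db, tmp)
  file.rename(tmp, path)
  invisible(db)
}

# Prefer the stored database; fall back to a normal scan when it is missing.
read_db <- function(lib, path = db_path(lib)) {
  if (file.exists(path)) return(readRDS(path))
  if (auto_update_enabled()) return(rebuild_db(lib, path))
  installed.packages(lib.loc = lib)
}
```

With the variable set, the first call to `read_db()` builds the file and later calls only pay for a single `readRDS()`. Keeping the file current would additionally require calling `rebuild_db()` after every install or removal, which this sketch leaves out.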
As described in Uwe's talk in the kick-off session on Day 1: create a database for each library of installed R packages.
This can help to speed up functions that check which packages are installed.