Release management and revdep checks

Toby Dylan Hocking edited this page Dec 5, 2022 · 23 revisions

Revdep checks

CRAN requires revdep (reverse dependency) checks to ensure that a new version of data.table does not break other CRAN packages that depend on it.

Run on your local machine

To run revdep checks on your local machine, use the code at https://github.com/Rdatatable/data.table/blob/master/.dev/revdep.R but note that, unless parallelized, the checks may take a long time (10-20 days).

Interpret results computed on NAU Monsoon

Toby Dylan Hocking (@tdhock) maintains a revdep check system that publishes its results on web pages linked from this directory (https://rcdata.nau.edu/genomic-ml/data.table-revdeps/analyze/). The system runs each of the 1300+ revdep checks in parallel on the NAU Monsoon compute cluster, so all results are available in only 3-4 hours. Every day at 00:01 MST (1 minute past midnight, Mountain Standard Time), a check starts using the current R-release, R-devel, data.table master, and data.table CRAN release. The code is in this git repo (https://github.com/tdhock/data.table-revdeps). As of 28 Nov 2022, the checks cover all dependency types ("Depends", "Imports", "LinkingTo", "Suggests", "Enhances"). The top of a typical result web page is shown below; it shows which versions of R and data.table were used for the checks.

[Image: top of a typical result web page, showing the versions of R and data.table used]

For each version of R, each revdep is checked with both data.table master and data.table release. If the check results differ, a row appears in the "significant differences" table, as in the example below:

[Image: example "significant differences" table]

The significant differences table is sorted by its first column, the first bad commit, which is the commit that git bisect identified as the cause of the problem. This makes it easy to see whether several revdeps have similar issues resulting from the same data.table commit/PR. In the example above, three packages each have a new WARNING upon installation. The links are:

  • first.bad.commit: the commit on GitHub, which is useful for determining the commit/PR where the problem started.
  • Package: the log file from running the revdep checks on Monsoon. Search this log for the new bad check to see additional details.
  • CRAN: the current check results on CRAN using data.table release on a Linux machine, for comparison (these should match the release column, which was computed on Monsoon).
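
The way the significant differences table is built (keep the rows where master and release disagree, then sort by first bad commit) can be sketched in Python. This is only an illustration, not the code used by the data.table-revdeps repo: the record layout, package names, statuses, and commit hashes below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical check results: one record per (R version, package).
# "release" and "master" hold the check status obtained with the
# data.table CRAN release and with data.table master, respectively.
results = [
    {"R": "R-release", "package": "pkgA", "release": "OK",
     "master": "WARNING", "first_bad_commit": "bbb222"},
    {"R": "R-release", "package": "pkgB", "release": "OK",
     "master": "OK", "first_bad_commit": None},
    {"R": "R-release", "package": "pkgC", "release": "OK",
     "master": "WARNING", "first_bad_commit": "aaa111"},
    {"R": "R-devel", "package": "pkgD", "release": "OK",
     "master": "WARNING", "first_bad_commit": "bbb222"},
]


def significant_differences(records):
    """Keep records whose master status differs from the release status,
    sorted by first bad commit so that related packages are adjacent."""
    diffs = [r for r in records if r["master"] != r["release"]]
    return sorted(diffs, key=lambda r: (r["first_bad_commit"] or "", r["package"]))


def group_by_commit(diffs):
    """Map each first bad commit to the packages it appears to affect."""
    groups = defaultdict(list)
    for r in diffs:
        groups[r["first_bad_commit"]].append(r["package"])
    return dict(groups)


for commit, packages in group_by_commit(significant_differences(results)).items():
    print(commit, packages)
```

Packages that share a first bad commit end up in the same group, which mirrors how the sorted table puts revdeps with a likely common cause next to each other.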

Steps to report a new revdep check problem:

  • First, search for the package name in the data.table issue tracker (https://github.com/Rdatatable/data.table/issues) to make sure that no issue already exists.
  • Create a new issue with at least (1) a brief description of the problem, (2) how to reproduce it, and (3) a link to the commit/PR where git bisect says the problem started happening (first.bad.commit column).
  • Optionally, add (4) @mentions of the people who authored the commit/PR where the problem started happening, and (5) a minimal reproducible example (MRE). An MRE is not always easy to create, but if you can provide one, it would likely be useful as a test case for data.table.
  • Example with minimal info and a mention: https://github.com/Rdatatable/data.table/issues/5544
  • Example with more info/analysis and a minimal reproducible example: https://github.com/Rdatatable/data.table/issues/5536
  • Also make sure that the issue/difference is real, by checking (1) whether it was found in other recent checks (for example, the previous day), (2) whether it occurs in both R-devel and R-release, (3) whether the result for data.table release equals the result from CRAN, and (4) whether git bisect found a non-trivial commit (a result is trivial when the parent is the same as the git bisect old commit, as in fplot below).

[Image: fplot example of a trivial git bisect result]
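
The four sanity checks in the last step can be written down as a small checklist function. The sketch below is hypothetical: the `Finding` fields and the `reasons_to_doubt` helper are invented for illustration, and the triviality test assumes that "parent" means the parent of the first bad commit.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical summary of one row of the significant differences table."""
    seen_in_previous_check: bool   # (1) also present in another recent check
    fails_on_r_devel: bool         # (2) occurs with R-devel
    fails_on_r_release: bool       # (2) occurs with R-release
    monsoon_release_status: str    # (3) status with data.table release on Monsoon
    cran_status: str               # (3) status reported on the CRAN check page
    first_bad_commit_parent: str   # (4) assumed: parent of the first bad commit
    bisect_old_commit: str         # (4) commit given to git bisect as "old"


def reasons_to_doubt(f):
    """Return the list of checks that the finding does not pass."""
    reasons = []
    if not f.seen_in_previous_check:
        reasons.append("not seen in other recent checks")
    if not (f.fails_on_r_devel and f.fails_on_r_release):
        reasons.append("does not occur in both R-devel and R-release")
    if f.monsoon_release_status != f.cran_status:
        reasons.append("release result differs from CRAN")
    if f.first_bad_commit_parent == f.bisect_old_commit:
        reasons.append("git bisect result is trivial")
    return reasons


finding = Finding(True, True, True, "OK", "OK", "aaa111", "aaa111")
print(reasons_to_doubt(finding))
```

An empty list suggests the difference is worth reporting; any listed reason is a prompt to look more closely before opening an issue.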