Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a GitHub Action to check for performance regressions within the incoming changes from PRs, along with two test cases based on historical regressions #6078

Merged
merged 15 commits into from
Apr 16, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 21 additions & 0 deletions .github/workflows/performance-tests.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
name: Autocomment atime-based performance regression analysis on PRs

on:
pull_request:
branches:
- '*'
types:
- opened
- reopened
- synchronize

jobs:
comment:
runs-on: ubuntu-latest
container: ghcr.io/iterative/cml:0-dvc2-base1
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
repo_token: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
steps:
- uses: Anirban166/[email protected]
69 changes: 69 additions & 0 deletions inst/atime/tests.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
pkg.edit.fun = function(old.Package, new.Package, sha, new.pkg.path) {
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
pkg_find_replace <- function(glob, FIND, REPLACE) {
atime::glob_find_replace(file.path(new.pkg.path, glob), FIND, REPLACE)
}
Package_regex <- gsub(".", "_?", old.Package, fixed = TRUE)
Package_ <- gsub(".", "_", old.Package, fixed = TRUE)
new.Package_ <- paste0(Package_, "_", sha)
pkg_find_replace(
"DESCRIPTION",
paste0("Package:\\s+", old.Package),
paste("Package:", new.Package))
pkg_find_replace(
file.path("src", "Makevars.*in"),
Package_regex,
new.Package_)
pkg_find_replace(
file.path("R", "onLoad.R"),
Package_regex,
new.Package_)
pkg_find_replace(
file.path("R", "onLoad.R"),
sprintf('packageVersion\\("%s"\\)', old.Package),
sprintf('packageVersion\\("%s"\\)', new.Package))
pkg_find_replace(
file.path("src", "init.c"),
paste0("R_init_", Package_regex),
paste0("R_init_", gsub("[.]", "_", new.Package_)))
pkg_find_replace(
"NAMESPACE",
sprintf('useDynLib\\("?%s"?', Package_regex),
paste0('useDynLib(', new.Package_))
}

test.list <- list(
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
# Performance regression discussed in: https://github.com/Rdatatable/data.table/issues/4311
# Fixed in: https://github.com/Rdatatable/data.table/pull/4440
"Test regression fixed in #4440" = list(
pkg.edit.fun = pkg.edit.fun,
N = 10^seq(3,8),
setup = quote({
set.seed(1L)
dt <- data.table(a = sample(N, N))
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
setindex(dt, a)
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
}),
expr = quote(data.table:::shallow(dt)),
Before = "9d3b9202fddb980345025a4f6ac451ed26a423be", # This needs to be changed later. Currently assigned to the merge commit in the PR that fixed the regression (https://github.com/Rdatatable/data.table/pull/4440) as the source of regression (or the particular commit that led to it) is not clear. In addition, older versions of data.table are having problems when being installed in this manner. (This includes commits from before Mar 20, 2020 or when the issue that discovered or first mentioned the regression was created)
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
Regression = "b1b1832b0d2d4032b46477d9fe6efb15006664f4", # Parent of the first commit (https://github.com/Rdatatable/data.table/commit/0f0e7127b880df8459b0ed064dc841acd22f5b73) in the PR (https://github.com/Rdatatable/data.table/pull/4440/commits) that fixes the regression
Fixed = "769f02c6fbbb031391a79f46c6042de99f1ea712"), # Last commit in the PR that fixed the regression (https://github.com/Rdatatable/data.table/pull/4440/commits)

# Test based on: https://github.com/Rdatatable/data.table/issues/5424
# Performance regression introduced from a commit in: https://github.com/Rdatatable/data.table/pull/4491
# Fixed in: https://github.com/Rdatatable/data.table/pull/5463
"Test regression fixed in #5463" = list(
pkg.edit.fun = pkg.edit.fun,
N = 10^seq(3, 8),
expr = quote(data.table:::`[.data.table`(dt_mod, , N := .N, by = g)),
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
setup = quote({
n <- N/100
set.seed(1L)
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
dt <- data.table(
g = sample(seq_len(n), N, TRUE),
x = runif(N),
key = "g")
dt_mod <- copy(dt)
}),
Before = "19b7866112614db53eb3e909c097407d91cd6738", # Parent of the regression commit (https://github.com/Rdatatable/data.table/commit/0895fa247afcf6b38044bd5f56c0d209691ddb31), which is the parent of the first commit in the PR that causes the issue (https://github.com/Rdatatable/data.table/pull/5493/commits)
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
Regression = "0895fa247afcf6b38044bd5f56c0d209691ddb31", # The regression commit is the parent of the first commit in the PR that fixed the issue (https://github.com/Rdatatable/data.table/pull/5493/commits)
Anirban166 marked this conversation as resolved.
Show resolved Hide resolved
Fixed = "1e03fe7b890e63da9651d997ea52548c90b3ae32") # Last commit in the PR that fixed the regression (https://github.com/Rdatatable/data.table/pull/5493/commits)
)
Loading