data.table performance regression 1534 #4
Does it work now?
@tdhock it seems the commit ID I chose for the fix is not the right one, so I'm looking for the correct fixed ID.
I revisited the commit IDs, since the ones I picked were not the exact SHAs.
I ran the new commit ID and I'm still getting an error message: "Error: When i is a data.table (or character vector), the columns to join by must be specified using 'on=' argument (see ?data.table), by keying x (i.e. sorted, and, marked as sorted, see ?setkey), or by sharing column names between x and i (i.e., a natural join). Keyed joins might have further speed benefits on very large data due to x being sorted in RAM." The commit IDs used in the code come from https://github.com/Rdatatable/data.table/pull/5205/commits
The code from Rdatatable/data.table#1534 is:
library(data.table)
dt <- data.table(Grp = rep(seq_len(1e6), each=10L))
dt[, Value := sample(100L, size = .N, replace = TRUE)]
system.time(dt[, PrevValueByGrp := shift(Value, type = "lag"), by = Grp][])
# user system elapsed
# 19.50 0.80 20.34
system.time(dt[, v := shift(Value, type = "lag")][rowid(Grp)==1L, v := NA][])
# user system elapsed
# 1.00 0.87 1.25
dt[, all.equal(v, PrevValueByGrp)]
# [1] TRUE
That code has two timed expressions: the first (shift with by = Grp) and the second (shift without by, followed by rowid(Grp)==1L). Your atime code has the same issue as the other code we discussed earlier today.
In the code above, expr contains both expressions, but it should contain only the first, as below:
The triple-colon data.table::: prefix is necessary for atime_versions, which will change it into data.table.21abc12891298etc::: based on the versions you specify.
Also, you should double check the N values, which sometimes seem to be smaller than 1:
N=10^seq(3,8),
setup={
n <- N/1e6
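To make the point about small values concrete, here is a minimal base R sketch of the arithmetic. It assumes only the two lines quoted above; how n is used later in setup is not shown in this thread.

```r
# Sizes as given in the comment above
N <- 10^seq(3, 8)

# Scaling applied at the start of the setup block
n <- N / 1e6

# Show each size next to its scaled value
print(data.frame(N = N, n = n))

# n is below 1 for the first three sizes (N = 1e3, 1e4, 1e5)
too_small <- N[n < 1]
print(too_small)
```

If n is meant to be a whole number of groups or rows, those three sizes would need either a larger starting N or a different scaling.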
Okay, got it.
@tdhock kindly help me restore the branch for Rdatatable/data.table#1534.
I'm trying to reproduce it, but I get an error which suggests to me that the branch has been closed:
"Error in value[3L] :
Error in revparse_single(object, branch): Error in 'git2r_revparse_single': Requested object could not be found
when trying to checkout 58135017a985f3cc2c6f0d091c4effaec4442f56"