Skip to content

Commit

Permalink
#3401 segfault fix (#3404)
Browse files Browse the repository at this point in the history
* Fix segfault issue, #3401

* Need to Free().
  • Loading branch information
arunsrinivasan authored Feb 16, 2019
1 parent bf0047f commit 161ac2c
Show file tree
Hide file tree
Showing 3 changed files with 25 additions and 3 deletions.
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@

6. Several issues were filed regarding limitations of `dcast.data.table` in handling `fun.aggregate` argument when the functions are not directly provided to the argument as `fun.aggregate <- list(sum, mean)` and instead are stored in a variable, e.g., `funs <- list(sum, mean)` and referred to as `fun.aggregate=funs`. This fix closes several issues [#1974](https://github.com/Rdatatable/data.table/issues/1974), [#1369](https://github.com/Rdatatable/data.table/issues/1369), [#2064](https://github.com/Rdatatable/data.table/issues/2064) and [#2949](https://github.com/Rdatatable/data.table/issues/2949). Thanks to @sunbee, @Ping2016, @smidelius and @d0rg0ld for reporting.

7. Fixed a minor regression in v1.12.0 that resulted in segfaults on some non-equi joins. Closes [#3401](https://github.com/Rdatatable/data.table/issues/3401). Thanks to @Gayyam for the bug report.

#### NOTES

1. When upgrading to 1.12.0 some Windows users might have seen `CdllVersion not found` in some circumstances. We found a way to catch that so the [helpful message](https://twitter.com/MattDowle/status/1084528873549705217) now occurs for those upgrading from versions prior to 1.12.0 too, as well as those upgrading from 1.12.0 to a later version. See item 1 in notes section of 1.12.0 below for more background.
Expand Down
23 changes: 21 additions & 2 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -13443,12 +13443,12 @@ myFun2 <- function(data, vars) {
}
funs = list(sum, mean)
vars = list("d1", "d2")

test(1987.1, names(dcast.data.table(dt, x + y ~ z, fun=funs, value.var=vars)),
c("x", "y", "d1_fun1_a", "d1_fun1_b", "d2_fun2_a", "d2_fun2_b"))
test(1987.2, dcast.data.table(dt, x + y ~ z, fun=sum, value.var=vars[[1]]),
myFun1(dt, vars[[1]]))
test(1987.3, dcast.data.table(dt, x + y ~ z, fun=list(f1=sum, first=function(x) x[1L]), value.var=vars),
myFun2(dt, vars))
test(1987.3, dcast.data.table(dt, x + y ~ z, fun=list(f1=sum, first=function(x) x[1L]), value.var=vars), myFun2(dt, vars))

# testing frankv behavior with NA/NaN; earlier tests compare consistency with base::rank,
# but we intentionally break from base w.r.t. ranking NAs (we consider NAs to be tied, ditto NaN)
Expand All @@ -13458,6 +13458,25 @@ test(1988.2, frankv(x, cols='r', order=1L, ties.method='max'), c(5L, 3L, 2L,
test(1988.3, frankv(x, cols='r', order=1L, ties.method='min'), c(5L, 3L, 2L, 10L, 1L, 8L, 4L, 8L, 6L, 7L, 10L))
test(1988.4, frankv(x, cols='r', order=1L, ties.method='dense'), c(5L, 3L, 2L, 9L, 1L, 8L, 4L, 8L, 6L, 7L, 9L))

# Test should not segfault, #3401 fix:
set.seed(1L)
foo <- function(n) apply(matrix(sample(letters, 4*n, TRUE), ncol=4L), 1, paste, collapse="")
dates <- sample(as.Date("2015-01-01"), as.Date("2018-12-31"), 300L)
codes <- foo(2000)
DT1 <- data.table(
date=sample(dates[1:100], 1e4, TRUE),
flight=sample(codes[1:1300], 1e4, TRUE),
val=runif(1e4)
)

DT2 <- data.table(
start=sample(dates[80:250], 1e4, TRUE),
end=sample(dates[100:300], 1e4, TRUE),
flight=sample(codes[1200:2000], 1e4, TRUE)
)
DT2[, c("start", "end") := .(pmin(start, end), pmax(start, end))]
# just testing if the code runs without segfault ..
test(1989.1, nrow(DT1[DT2, on=.(date <= end, date >= start, flight==flight)]) > 0L, TRUE)

###################################
# Add new tests above this line #
Expand Down
3 changes: 2 additions & 1 deletion src/uniqlist.c
Original file line number Diff line number Diff line change
Expand Up @@ -214,7 +214,7 @@ SEXP nestedid(SEXP l, SEXP cols, SEXP order, SEXP grps, SEXP resetvals, SEXP mul
R_len_t nrows = length(VECTOR_ELT(l,0)), ncols = length(cols);
if (nrows==0) return(allocVector(INTSXP, 0));
R_len_t thisi, previ, ansgrpsize=1000, nansgrp=0;
R_len_t *ansgrp = (R_len_t *)R_alloc(ansgrpsize, sizeof(R_len_t)), starts, grplen;
R_len_t *ansgrp = Calloc(ansgrpsize, R_len_t), starts, grplen; // #3401 fix. Needs to be Calloc due to Realloc below .. else segfaults.
R_len_t ngrps = length(grps);
bool *i64 = (bool *)R_alloc(ncols, sizeof(bool));
if (ngrps==0) error("Internal error: nrows[%d]>0 but ngrps==0", nrows); // # nocov
Expand Down Expand Up @@ -298,6 +298,7 @@ SEXP nestedid(SEXP l, SEXP cols, SEXP order, SEXP grps, SEXP resetvals, SEXP mul
}
ansgrp[tmp] = thisi;
}
Free(ansgrp);
UNPROTECT(1);
return(ans);
}
Expand Down

0 comments on commit 161ac2c

Please sign in to comment.