Progress towards #2572 (miscellaneous tests) (#2605)
MichaelChirico authored and mattdowle committed Feb 22, 2018
1 parent 5174a8c commit e2386ab
Showing 2 changed files with 58 additions and 34 deletions.
66 changes: 33 additions & 33 deletions inst/tests/tests.Rraw
@@ -3077,8 +3077,7 @@ test(1042, DT[-5, mean(x), by = group], data.table(group=c(1,2), V1=c(1.5, 3.5))
# Test when abs(negative index) > nrow(dt) - should warn
test(1042.1, DT[-10], DT, warning="Item 1 of i is -10 but there are only 5 rows. Ignoring this and 0 more like it out of 1.")
test(1042.2, DT[c(-5, -10), mean(x), by = group], data.table(group=c(1,2),V1=c(1.5,3.5)), warning="Item 2 of i is -10 but there are only 5 rows. Ignoring this and 0 more like it out of 2.")
# Test #1043 TO DO - mixed negatives
test(1043, DT[c(1, -5)], error="Item 2 of i is -5 and item 1 is 1. Cannot mix positives and negatives.")
test(1043, DT[c(1, -5)], error="Cannot mix positives and negatives.")

# crash (floating point exception), when assigning null data.table() to multiple cols, #4731
DT = data.table(x=1:5,y=6:10)
@@ -11693,48 +11692,49 @@ test(1877.3, attr(setattr(data.table(x = 1:10), "test", character()), "test"), c
# In dev 1.10.5 these were parsed as floats, #2625. Caught before release to CRAN.
test(1878, fread("A,B,C,D,E\n.,+.,.e,.e+,0e\n"), data.table(A=".", B="+.", C=".e", D=".e+", E="0e"))

##########################

# TODO: Tests involving GForce functions need to be run with optimisation level 1 and 2, so that both functions are tested all the time.

# TO DO: Add test for fixed bug #5519 - dcast returned error when a package imported data.table, but didn't happen when "depends" on data.table. This is fixed (commit 1263 v1.9.3), but not sure how to add test.

# TO DO: test and highlight in docs that negatives are fine and fast in forderv (ref R wish #15644)
# TO DO: tests of freading classes like Date and the verbose messages there.
# TO DO: Test mid read bump of logical T/F to character, collapse back to T and F.

# TO DO: add examples of multiple LHS (name and position) and multiple RHS to example(":=")
# TO DO: tests on double in ad hoc by
# TO DO: test on -i that retain key e.g. DT[-4] and DT[-4,sum(v),by=b] should both retain key
# test on out of bound i subsets e.g. 6:10 when DT has 7 rows, and mixed negative and positive i integer is error.
# test that ordered subsets when i is unkeyed now retain x's key (using is.sorted(f__))

# TO DO: add FAQ that eval() is evaled in calling frame so don't need a, then update SO question of 14 March. See the test using variable name same as column name. Actually, is that true? Need "..J".
# TO DO: why did SO answer using eval twice in j need .SD in lapply(f,eval,.SD) on 19 Apr
# assortment of tests from #2572
## negative indexing should retain key
DT = data.table(a = c(5, 5, 7, 2, 2),
b = 1:5, key = 'a')
test(1879.1, key(DT[-c(2, 3)]), 'a')
test(1879.2, key(DT[-(1:5)]), 'a')
test(1879.3, key(DT[-2, sum(b), by = a]), 'a')
## behavior of out-of-bound subsets
## (mixed +/- already covered in 1043)
test(1879.4, DT[3:6],
data.table(a = c(5, 5, 7, NA),
b = c(1L, 2L, 3L, NA)))
test(1879.5, DT[0:5], DT)
## if fread bumps logical to character,
## the original string representation should be kept
DT = data.table(A=rep("True", 2200), B="FALSE", C='0')
DT[111, LETTERS[1:3] := .("fread", "is", "faithful")]
fwrite(DT, f<-tempfile())
test(1879.6, fread(f, verbose=TRUE), DT,
output=paste("Column 1.*bumped from 'bool8' to 'string'",
"Column 2.*bumped from 'bool8' to 'string'",
"Column 3.*bumped from 'bool8' to 'string'",
sep = '.*'))
unlink(f)

# TO DO: check the "j is named list could be inefficient" message from verbose that Chris N showed recently to 15 May
# TO DO: !make sure explicitly that unnamed lists are being executed by dogroups!
# TO DO: Add to warning about a previous copy that class<-, levels<- can also copy whole vector. *Any* fun<- form basically.
# TO DO: use looped := vs set test in example(":=") or example(setnames) to test overhead in [.data.table is tested to stay low in future.

# TO DO: add tests on smaller examples with NAs for 'frankv', even though can't compare to base::rank.
## See test-* for more tests
###################################
# Add new tests above this line #
###################################

##########################
options(warn=0)
setDTthreads(0)
options(oldalloccol) # set at top of this file

plat = paste("endian==",.Platform$endian,", sizeof(long double)==",.Machine$sizeof.longdouble,
", sizeof(pointer)==",.Machine$sizeof.pointer, sep="")
plat = paste0("endian==", .Platform$endian,
", sizeof(long double)==", .Machine$sizeof.longdouble,
", sizeof(pointer)==", .Machine$sizeof.pointer)
if (nfail > 0) {
if (nfail>1) {s1="s";s2="s: "} else {s1="";s2=" "}
cat("\r")
stop(nfail," error",s1," out of ",ntest, " (lastID=",lastnum,", ",plat, ") in inst/tests/tests.Rraw on ",date(),". Search tests.Rraw for test number",s2,paste(whichfail,collapse=", "),".")
# important to stop() here, so that 'R CMD check' fails
}
cat("\n",plat,"\n\nAll ",ntest," tests in inst/tests/tests.Rraw completed ok in ",timetaken(started.at)," on ",date(),"\n",sep="")
# date() is included so we can tell when CRAN checks were run (in particular if they have been rerun since
# an update to Rdevel itself; data.table doesn't have any other dependency) since there appears to be no other
# way to see the timestamp that CRAN checks were run. Some CRAN machines lag by several days.
# date() is included so we can tell exactly when these tests ran on CRAN. Sometimes a CRAN log can show error but that can be just
# stale due to not updating yet since a fix in R-devel, for example.

26 changes: 25 additions & 1 deletion man/assign.Rd
@@ -27,7 +27,7 @@ set(x, i = NULL, j, value)
\item{i}{ Optional. Indicates the rows on which the values must be updated. If not provided, implies \emph{all rows}. The \code{:=} form is more powerful as it allows \emph{subsets} and \code{joins} based add/update columns by reference. See \code{Details}.
In \code{set}, only integer type is allowed in \code{i} indicating which rows \code{value} should be assigned to. \code{NULL} represents all rows more efficiently than creating a vector such as \code{1:nrow(x)}. }
\item{j}{ Column name(s) (character) or number(s) (integer) to be assigned \code{value} when column(s) already exist, and only column name(s) if they are to be added newly. }
\item{j}{ Column name(s) (character) or number(s) (integer) to be assigned \code{value} when column(s) already exist, and only column name(s) if they are to be created. }
\item{value}{ A list of replacement values to assign by reference to \code{x[i, j]}. }
}
\details{
@@ -100,6 +100,30 @@ setkey(DT, a)
DT["A", b := 0L] # binary search for group "A" and set column b using keys
DT["B", f := mean(d)] # subassign to new column, NA initialized
# Adding multiple columns
## by name
DT[ , c('sin_d', 'log_e', 'cos_d') :=
.(sin(d), log(e), cos(d))]
## by patterned name
DT[ , paste(c('sin', 'cos'), 'b', sep = '_') :=
.(sin(b), cos(b))]
## using lapply & .SD
DT[ , paste0('tan_', c('b', 'd', 'e')) :=
lapply(.SD, tan), .SDcols = c('b', 'd', 'e')]
## using forced evaluation to disambiguate a vector of names
## and overwrite existing columns with their squares
sq_cols = c('b', 'd', 'e')
DT[ , (sq_cols) := lapply(.SD, `^`, 2L), .SDcols = sq_cols]
## by integer (NB: for robustness, it is not recommended
## to use explicit integers to update/define columns)
DT[ , c(2L, 3L, 4L) := .(sqrt(b), sqrt(d), sqrt(e))]
## by implicit integer
DT[ , grep('a$', names(DT)) := tolower(a)]
## by implicit integer, using forced evaluation
sq_col_idx = grep('d$', names(DT))
DT[ , (sq_col_idx) := lapply(.SD, dnorm),
.SDcols = sq_col_idx]
\dontrun{
# Speed example ...

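The behaviours exercised by tests 1879.1 and 1879.4 and by the new `man/assign.Rd` examples can be tried outside the test harness. The following is a minimal sketch, not part of the commit: it assumes the data.table package is installed, and the names `kept_key`, `oob` and `DT2` are hypothetical, introduced only for illustration.

```r
library(data.table)

# Keyed table as in tests 1879.1-1879.5; setting the key sorts rows by `a`.
DT = data.table(a = c(5, 5, 7, 2, 2), b = 1:5, key = 'a')

# Negative i drops rows; the result is expected to keep the key on `a`.
kept_key = key(DT[-c(2, 3)])

# Row numbers beyond nrow(DT) yield all-NA rows instead of an error.
oob = DT[3:6]

# Separate, unkeyed table (hypothetical) for the assignment example.
DT2 = data.table(d = c(1, 2, 3), e = c(4, 5, 6))
sq_cols = c('d', 'e')

# Wrapping sq_cols in parentheses makes := use the names stored in the
# variable rather than creating a single column called "sq_cols".
DT2[ , (sq_cols) := lapply(.SD, `^`, 2L), .SDcols = sq_cols]
```

Using a second table for the `:=` step keeps the keyed `DT` untouched, so the subsetting results do not depend on the by-reference update.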