Rbind in 1.14.3 doesn't like POSIX #5309
Some git bisecting suggests commit 4922384 is where this got introduced.
Combining rbindlist(list(data.table(a=NA), data.table(a=as.POSIXct("2021-01-01"))))
#> Error in rbindlist(list(data.table(a = NA), data.table(a = as.POSIXct("2021-01-01")))) :
#> Class attribute on column 1 of item 2 does not match with column 1 of item 1.
I stumbled across this issue, too, and my workaround was to temporarily convert
sessionInfo()
#> R version 4.1.2 (2021-11-01)
#> Platform: aarch64-apple-darwin21.1.0 (64-bit)
#> Running under: macOS Monterey 12.2
#>
#> Matrix products: default
#> BLAS: /opt/homebrew/Cellar/openblas/0.3.19/lib/libopenblasp-r0.3.19.dylib
#> LAPACK: /opt/homebrew/Cellar/r/4.1.2/lib/R/lib/libRlapack.dylib
#>
#> locale:
#> [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] data.table_1.14.2
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_4.1.2
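One possible workaround, sketched here as an assumption and not necessarily the conversion the commenter above had in mind: give the all-NA column the target class up front, so that both items carry the same class attribute. The `tz = "UTC"` argument and the names `na_item` and `val_item` are added only for illustration and reproducibility.

```r
library(data.table)

# Assumption: use a POSIXct-typed NA instead of a bare logical NA,
# so the class attribute is the same in both items.
na_item  <- data.table(a = as.POSIXct(NA, tz = "UTC"))
val_item <- data.table(a = as.POSIXct("2021-01-01", tz = "UTC"))

res <- rbindlist(list(na_item, val_item))
str(res$a)
```

Because both columns are already POSIXct, no class mismatch should arise regardless of which item comes first.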
Can confirm this happens with both
I've run into this issue with IDates. I note that while rbind fails on both versions 1.14.2 and 1.14.3 with the toy data below, the merge is actually successful on 1.14.2 but not 1.14.3. I haven't exhaustively looked into this, but reverting the change made to rbindlist.c in #5263 at least makes the merge work. I think this is probably the same issue as #5391.
1.14.3 (neither merge nor rbind work with IDate)
library(data.table)
item1 <- data.table(col1 = c(1,2,3,4),
col2_x = c(as.IDate("2016-01-01"),
as.IDate("2016-01-02"),
as.IDate("2016-01-03"),
as.IDate("2016-01-04")),
col2_y = c(NA, NA, NA, NA))
item2 <- data.table(col1 = c(5,6,7,8),
col2_x = c(NA, NA, NA, NA),
col2_y = c("p", "q", "r", "s"))
item3 <- data.table(col1 = c(1,2,3,4),
col2_x = c("2016-01-01",
"2016-01-02",
"2016-01-03",
"2016-01-04"),
col2_y = c(NA, NA, NA, NA))
rbind(item1, item2)
#> Error in rbindlist(l, use.names, fill, idcol): Class attribute on column 2 of item 2 does not match with column 2 of item 1.
rbind(item3, item2)
#> col1 col2_x col2_y
#> <num> <char> <char>
#> 1: 1 2016-01-01 <NA>
#> 2: 2 2016-01-02 <NA>
#> 3: 3 2016-01-03 <NA>
#> 4: 4 2016-01-04 <NA>
#> 5: 5 <NA> p
#> 6: 6 <NA> q
#> 7: 7 <NA> r
#> 8: 8 <NA> s
item1_merge <- data.table(col1 = c(1,2,3,4),
col2 = c(as.IDate("2016-01-01"),
as.IDate("2016-01-02"),
as.IDate("2016-01-03"),
as.IDate("2016-01-04")))
item2_merge <- data.table(col1 = c(5,6,7,8),
col2 = c(NA, NA, NA, NA))
merge(x = item1_merge,
y = item2_merge,
by = "col1",
all = T)
#> Error in rbindlist(l, use.names, fill, idcol): Class attribute on column 3 of item 2 does not match with column 3 of item 1.
sessionInfo()
#> R version 4.1.3 (2022-03-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] data.table_1.14.3
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.38 magrittr_2.0.2 R.cache_0.15.0
#> [5] rlang_1.0.2 fastmap_1.1.0 fansi_1.0.3 stringr_1.4.0
#> [9] styler_1.7.0 highr_0.9 tools_4.1.3 xfun_0.30
#> [13] R.oo_1.24.0 utf8_1.2.2 cli_3.2.0 withr_2.5.0
#> [17] htmltools_0.5.2 ellipsis_0.3.2 yaml_2.3.5 digest_0.6.29
#> [21] tibble_3.1.6 lifecycle_1.0.1 crayon_1.5.1 purrr_0.3.4
#> [25] R.utils_2.11.0 vctrs_0.3.8 fs_1.5.2 glue_1.6.2
#> [29] evaluate_0.15 rmarkdown_2.13 reprex_2.0.1 stringi_1.7.6
#> [33] compiler_4.1.3 pillar_1.7.0 R.methodsS3_1.8.1 pkgconfig_2.0.3
Created on 2022-08-22 by the reprex package (v2.0.1)
1.14.2 (merge works, rbind doesn't)
library(data.table)
item1 <- data.table(col1 = c(1,2,3,4),
col2_x = c(as.IDate("2016-01-01"),
as.IDate("2016-01-02"),
as.IDate("2016-01-03"),
as.IDate("2016-01-04")),
col2_y = c(NA, NA, NA, NA))
item2 <- data.table(col1 = c(5,6,7,8),
col2_x = c(NA, NA, NA, NA),
col2_y = c("p", "q", "r", "s"))
item3 <- data.table(col1 = c(1,2,3,4),
col2_x = c("2016-01-01",
"2016-01-02",
"2016-01-03",
"2016-01-04"),
col2_y = c(NA, NA, NA, NA))
rbind(item1, item2)
#> Error in rbindlist(l, use.names, fill, idcol): Class attribute on column 2 of item 2 does not match with column 2 of item 1.
rbind(item3, item2)
#> col1 col2_x col2_y
#> 1: 1 2016-01-01 <NA>
#> 2: 2 2016-01-02 <NA>
#> 3: 3 2016-01-03 <NA>
#> 4: 4 2016-01-04 <NA>
#> 5: 5 <NA> p
#> 6: 6 <NA> q
#> 7: 7 <NA> r
#> 8: 8 <NA> s
item1_merge <- data.table(col1 = c(1,2,3,4),
col2 = c(as.IDate("2016-01-01"),
as.IDate("2016-01-02"),
as.IDate("2016-01-03"),
as.IDate("2016-01-04")))
item2_merge <- data.table(col1 = c(5,6,7,8),
col2 = c(NA, NA, NA, NA))
merge(x = item1_merge,
y = item2_merge,
by = "col1",
all = T)
#> col1 col2.x col2.y
#> 1: 1 2016-01-01 NA
#> 2: 2 2016-01-02 NA
#> 3: 3 2016-01-03 NA
#> 4: 4 2016-01-04 NA
#> 5: 5 <NA> NA
#> 6: 6 <NA> NA
#> 7: 7 <NA> NA
#> 8: 8 <NA> NA
sessionInfo()
#> R version 4.1.3 (2022-03-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] data.table_1.14.2
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.38 magrittr_2.0.2 R.cache_0.15.0
#> [5] rlang_1.0.2 fastmap_1.1.0 fansi_1.0.3 stringr_1.4.0
#> [9] styler_1.7.0 highr_0.9 tools_4.1.3 xfun_0.30
#> [13] R.oo_1.24.0 utf8_1.2.2 cli_3.2.0 withr_2.5.0
#> [17] htmltools_0.5.2 ellipsis_0.3.2 yaml_2.3.5 digest_0.6.29
#> [21] tibble_3.1.6 lifecycle_1.0.1 crayon_1.5.1 purrr_0.3.4
#> [25] R.utils_2.11.0 vctrs_0.3.8 fs_1.5.2 glue_1.6.2
#> [29] evaluate_0.15 rmarkdown_2.13 reprex_2.0.1 stringi_1.7.6
#> [33] compiler_4.1.3 pillar_1.7.0 R.methodsS3_1.8.1 pkgconfig_2.0.3
Created on 2022-08-22 by the reprex package (v2.0.1)
1.14.3 but with change to rbindlist.c reverted
library(data.table)
item1 <- data.table(col1 = c(1,2,3,4),
col2_x = c(as.IDate("2016-01-01"),
as.IDate("2016-01-02"),
as.IDate("2016-01-03"),
as.IDate("2016-01-04")),
col2_y = c(NA, NA, NA, NA))
item2 <- data.table(col1 = c(5,6,7,8),
col2_x = c(NA, NA, NA, NA),
col2_y = c("p", "q", "r", "s"))
item3 <- data.table(col1 = c(1,2,3,4),
col2_x = c("2016-01-01",
"2016-01-02",
"2016-01-03",
"2016-01-04"),
col2_y = c(NA, NA, NA, NA))
rbind(item1, item2)
#> Error in rbindlist(l, use.names, fill, idcol): Class attribute on column 2 of item 2 does not match with column 2 of item 1.
rbind(item3, item2)
#> col1 col2_x col2_y
#> <num> <char> <char>
#> 1: 1 2016-01-01 <NA>
#> 2: 2 2016-01-02 <NA>
#> 3: 3 2016-01-03 <NA>
#> 4: 4 2016-01-04 <NA>
#> 5: 5 <NA> p
#> 6: 6 <NA> q
#> 7: 7 <NA> r
#> 8: 8 <NA> s
item1_merge <- data.table(col1 = c(1,2,3,4),
col2 = c(as.IDate("2016-01-01"),
as.IDate("2016-01-02"),
as.IDate("2016-01-03"),
as.IDate("2016-01-04")))
item2_merge <- data.table(col1 = c(5,6,7,8),
col2 = c(NA, NA, NA, NA))
merge(x = item1_merge,
y = item2_merge,
by = "col1",
all = T)
#> Warning in rbindlist(l, use.names, fill, idcol): use.names= cannot be FALSE when
#> fill is TRUE. Setting use.names=TRUE.
#> Key: <col1>
#> col1 col2.x col2.y
#> <num> <IDat> <lgcl>
#> 1: 1 2016-01-01 NA
#> 2: 2 2016-01-02 NA
#> 3: 3 2016-01-03 NA
#> 4: 4 2016-01-04 NA
#> 5: 5 <NA> NA
#> 6: 6 <NA> NA
#> 7: 7 <NA> NA
#> 8: 8 <NA> NA
sessionInfo()
#> R version 4.1.3 (2022-03-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] data.table_1.14.3
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.38 magrittr_2.0.2 R.cache_0.15.0
#> [5] rlang_1.0.2 fastmap_1.1.0 fansi_1.0.3 stringr_1.4.0
#> [9] styler_1.7.0 highr_0.9 tools_4.1.3 xfun_0.30
#> [13] R.oo_1.24.0 utf8_1.2.2 cli_3.2.0 withr_2.5.0
#> [17] htmltools_0.5.2 ellipsis_0.3.2 yaml_2.3.5 digest_0.6.29
#> [21] tibble_3.1.6 lifecycle_1.0.1 crayon_1.5.1 purrr_0.3.4
#> [25] R.utils_2.11.0 vctrs_0.3.8 fs_1.5.2 glue_1.6.2
#> [29] evaluate_0.15 rmarkdown_2.13 reprex_2.0.1 stringi_1.7.6
#> [33] compiler_4.1.3 pillar_1.7.0 R.methodsS3_1.8.1 pkgconfig_2.0.3
Created on 2022-08-22 by the reprex package (v2.0.1)
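To see why the error message talks about a class attribute, it can help to inspect the attributes of the two kinds of column directly. The sketch below is a base-R analogue that uses `Date` in place of `IDate` (an assumption made so that it runs without data.table); the actual check lives in rbindlist.c and may differ in detail.

```r
# Base-R analogue (Date instead of IDate) of the columns being combined.
dated <- as.Date(c("2016-01-01", "2016-01-02"))
bare  <- c(NA, NA)

attr(dated, "class")   # "Date"
attr(bare, "class")    # NULL: a bare NA vector is plain logical
typeof(bare)           # "logical"

# The two class attributes are not identical, which is the kind of
# mismatch the rbindlist error message reports.
same_class <- identical(attr(dated, "class"), attr(bare, "class"))
same_class             # FALSE
```

A column built from `c(NA, NA, NA, NA)` therefore carries no class at all, while the date column does.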
Thank you for the extra report. We need to ensure that it goes into the unit tests before closing this issue.
BTW, in the current dev release, the order of x and y in the merge command can determine whether this issue is triggered.
library(data.table)
item1_merge <- data.table(col1 = c(1,2,3,4),
col2 = c(as.IDate("2016-01-01"),
as.IDate("2016-01-02"),
as.IDate("2016-01-03"),
as.IDate("2016-01-04")))
item2_merge <- data.table(col1 = c(5,6,7,8),
col2 = c(NA, NA, NA, NA))
merge(x = item1_merge,
y = item2_merge,
by = "col1",
all = T)
#> Error in rbindlist(l, use.names, fill, idcol): Class attribute on column 3 of item 2 does not match with column 3 of item 1.
merge(x = item2_merge,
y = item1_merge,
by = "col1",
all = T)
#> Key: <col1>
#> col1 col2.x col2.y
#> <num> <lgcl> <IDat>
#> 1: 1 NA 2016-01-01
#> 2: 2 NA 2016-01-02
#> 3: 3 NA 2016-01-03
#> 4: 4 NA 2016-01-04
#> 5: 5 NA <NA>
#> 6: 6 NA <NA>
#> 7: 7 NA <NA>
#> 8: 8 NA <NA>
sessionInfo()
#> R version 4.1.3 (2022-03-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] data.table_1.14.3
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.38 magrittr_2.0.2 R.cache_0.15.0
#> [5] rlang_1.0.2 fastmap_1.1.0 fansi_1.0.3 stringr_1.4.0
#> [9] styler_1.7.0 highr_0.9 tools_4.1.3 xfun_0.30
#> [13] R.oo_1.24.0 utf8_1.2.2 cli_3.2.0 withr_2.5.0
#> [17] htmltools_0.5.2 ellipsis_0.3.2 yaml_2.3.5 digest_0.6.29
#> [21] tibble_3.1.6 lifecycle_1.0.1 crayon_1.5.1 purrr_0.3.4
#> [25] R.utils_2.11.0 vctrs_0.3.8 fs_1.5.2 glue_1.6.2
#> [29] evaluate_0.15 rmarkdown_2.13 reprex_2.0.1 stringi_1.7.6
#> [33] compiler_4.1.3 pillar_1.7.0 R.methodsS3_1.8.1 pkgconfig_2.0.3
Created on 2022-08-23 by the reprex package (v2.0.1)
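If the all-NA column is meant to hold dates anyway, one way to sidestep the order dependence is to type it as IDate before merging. This is a minimal sketch under that assumption, not a fix for the underlying behavior; the names `res_xy` and `res_yx` are introduced here for illustration.

```r
library(data.table)

item1_merge <- data.table(col1 = c(1, 2, 3, 4),
                          col2 = as.IDate(c("2016-01-01", "2016-01-02",
                                            "2016-01-03", "2016-01-04")))
# Assumption: col2 is created as an IDate-typed NA rather than a logical NA.
item2_merge <- data.table(col1 = c(5, 6, 7, 8),
                          col2 = as.IDate(rep(NA, 4)))

# Both argument orders now combine columns of the same class.
res_xy <- merge(item1_merge, item2_merge, by = "col1", all = TRUE)
res_yx <- merge(item2_merge, item1_merge, by = "col1", all = TRUE)
```

With both `col2` columns sharing the IDate class, neither order should hit the class attribute check.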
@berg-michael IMO your first example of (shortened it)
should never work because of different types. Ofc we could check if all values of a column are
This may be an ignorant question, but doesn't your example work for non-date classes? I can run something like
library(data.table)
x <- data.table(a = 1, b = "2016-01-01")
y <- data.table(a = 5, b = NA_integer_)
str(rbind(x, y))
#> Classes 'data.table' and 'data.frame': 2 obs. of 2 variables:
#> $ a: num 1 5
#> $ b: chr "2016-01-01" NA
#> - attr(*, ".internal.selfref")=<externalptr>
It seems like rbind will coerce classes/types in most but not all situations, and merge with all = T can rely on that behavior when there are non-matching rows between two datasets. Though I don't fully understand why the call you gave doesn't work in either 1.14.2 or 1.14.3, yet merge works fine in these cases in 1.14.2 while sometimes breaking in 1.14.3. I believe it is related to allowing use.names = FALSE when fill = TRUE, since the merge works fine in 1.14.3 if I force use.names to TRUE.
I guess what you describe is really just #3911.
I see, so apparently, we do type bumping for atomic types. If we allow different
x = data.table(a = 1, b = as.IDate(16801))
y = data.table(a = 5, b = NA)
rbind(x, y)
#> a b
#> <num> <IDat>
#> 1: 1 2016-01-01
#> 2: 5 <NA>
rbind(y, x)
#> a b
#> <num> <int>
#> 1: 5 NA
#> 2: 1 16801
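The `16801` in the second result is the underlying day count of the date, left behind once the class attribute is dropped. The sketch below shows the same relationship with base-R `Date` (used here as an assumption, as a stand-in for `IDate`, so the example runs without data.table).

```r
d <- as.Date("2016-01-01")

# Dates are stored as days since 1970-01-01; the class only affects display.
days_since_epoch <- as.integer(unclass(d))
days_since_epoch                                     # 16801

# Re-attaching the date interpretation recovers the original value.
round_trip <- as.Date(days_since_epoch, origin = "1970-01-01")
format(round_trip)                                   # "2016-01-01"
```

So `rbind(y, x)` keeps the data but loses the date interpretation, whereas `rbind(x, y)` keeps both.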
* add regression fix
* add tests from #5309
* added comment about NA rectangle
* emphasize subtle part about attributes too
Co-authored-by: Michael Chirico

* add fix #5309
* fix test numbering
* add rbind for ITime
* more tests
* add merge tests
* add AsIs #4934
* add news
* news typo
* add ignore.attr argument
* fix news
* change arguments of registered rbindlist
* add attribute to usage
* move nanotime tests
* adjust test numbering
* add test coverage
* prohibit NA for ignore.att
* move news
* finish todo of #5857
* Update NEWS.md
* update comment
* update doc for ignore.attr
* fix nit ignoreattr
* fix test consistency
* remove setnames
* update asis test to use rbindlist
* update test comments
* update NEWS num
* NEWS wording
* more NEWS wording
* template message for i18n
* simplify condition (C boolean --> no NA to worry about)
* && not &
* correct error message
Co-authored-by: Michael Chirico
Looks like rbind (and by extension merge(all = T)) doesn't like POSIXct type data in 1.14.3. To the extent this is a legit bug, I think it's been introduced relatively recently (code running on an older version of 1.14.3 seemed to work just fine).
Under 1.14.2: