Rename drop_na into remove_na in data_match() #556

Merged 3 commits on Oct 10, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: datawizard
Title: Easy Data Wrangling and Statistical Transformations
- Version: 0.13.0.2
+ Version: 0.13.0.5
Authors@R: c(
person("Indrajeet", "Patil", , "[email protected]", role = "aut",
comment = c(ORCID = "0000-0003-1995-6531")),
5 changes: 5 additions & 0 deletions NEWS.md
@@ -1,5 +1,10 @@
# datawizard (development)

+ BREAKING CHANGES
+
+ * Argument `drop_na` in `data_match()` is deprecated now. Please use `remove_na`
+   instead.
+
CHANGES

* The `select` argument, which is available in different functions to select
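
For callers, the migration described in the BREAKING CHANGES entry amounts to renaming one argument. A minimal sketch, assuming a datawizard build that includes this PR is installed; the `mtcars` data and the `vs`/`am` values are chosen only for illustration, and the call is guarded so nothing runs if the package is absent:

```r
res <- NULL
if (requireNamespace("datawizard", quietly = TRUE)) {
  # Previously this call would have used `drop_na = TRUE`;
  # after this PR the same intent is spelled `remove_na = TRUE`.
  res <- datawizard::data_match(
    mtcars,
    data.frame(vs = 0, am = 1), # illustrative match values, not from the PR
    remove_na = TRUE
  )
}
```

Code that still passes `drop_na` should keep working for now, but according to the diff in R/data_match.R it triggers a deprecation warning.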
2 changes: 1 addition & 1 deletion R/data_group.R
@@ -51,7 +51,7 @@ data_group <- function(data,
to = my_grid[i, , drop = FALSE],
match = "and",
return_indices = TRUE,
- drop_na = FALSE
+ remove_na = FALSE
))
})
my_grid[[".rows"]] <- .rows
19 changes: 16 additions & 3 deletions R/data_match.R
@@ -15,7 +15,7 @@
#' @param return_indices Logical, if `TRUE`, return the vector of rows that
#' can be used to filter the original data frame. If `FALSE` (default),
#' returns directly the filtered data frame instead of the row indices.
- #' @param drop_na Logical, if `TRUE`, missing values (`NA`s) are removed before
+ #' @param remove_na Logical, if `TRUE`, missing values (`NA`s) are removed before
#' filtering the data. This is the default behaviour, however, sometimes when
#' row indices are requested (i.e. `return_indices=TRUE`), it might be useful
#' to preserve `NA` values, so returned row indices match the row indices of
@@ -26,6 +26,7 @@
#' character vector (e.g. `c("x > 4", "y == 2")`) or a variable that contains
#' the string representation of a logical expression. These might be useful
#' when used in packages to avoid defining undefined global variables.
+ #' @param drop_na Deprecated, please use `remove_na` instead.
#'
#' @return A filtered data frame, or the row indices that match the specified
#' configuration.
@@ -100,12 +101,24 @@
#' data_filter(mtcars, fl)
#' @inherit data_rename seealso
#' @export
- data_match <- function(x, to, match = "and", return_indices = FALSE, drop_na = TRUE, ...) {
+ data_match <- function(x,
+ to,
+ match = "and",
+ return_indices = FALSE,
+ remove_na = TRUE,
+ drop_na,

# Lint warning (function_argument_linter), R/data_match.R, line 109, col 24:
# Arguments without defaults should come before arguments with defaults.
# Consider setting the default to NULL and using is.null() instead of using missing()
+ ...) {
if (!is.data.frame(to)) {
to <- as.data.frame(to)
}
original_x <- x

+ ## TODO: remove deprecated argument later
+ if (!missing(drop_na)) {
+   insight::format_warning("Argument `drop_na` is deprecated. Please use `remove_na` instead.")
+   remove_na <- drop_na
+ }
+
# evaluate
match <- match.arg(tolower(match), c("and", "&", "&&", "or", "|", "||", "!", "not"))
match <- switch(match,
@@ -133,7 +146,7 @@
idx <- vector("numeric", length = 0L)
} else {
# remove missings before matching
- if (isTRUE(drop_na)) {
+ if (isTRUE(remove_na)) {
x <- x[stats::complete.cases(x), , drop = FALSE]
}
idx <- seq_len(nrow(x))
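
The deprecation shim in this file can be tried in isolation. The following is a sketch only: `filter_rows()` and `filter_rows_null()` are hypothetical stand-ins rather than datawizard functions, and base `warning()` replaces `insight::format_warning()` so the example runs without extra packages. The second function shows the `NULL`-default variant that the lint message proposes.

```r
# Hypothetical stand-in mirroring the pattern in the diff: the old argument
# has no default, so missing() reveals whether the caller supplied it.
filter_rows <- function(x, remove_na = TRUE, drop_na, ...) {
  if (!missing(drop_na)) {
    warning(
      "Argument `drop_na` is deprecated. Please use `remove_na` instead.",
      call. = FALSE
    )
    remove_na <- drop_na # the deprecated value overrides the new argument
  }
  if (isTRUE(remove_na)) {
    x <- x[stats::complete.cases(x), , drop = FALSE]
  }
  x
}

# Variant along the lines of the lint suggestion: default to NULL and
# test with is.null() instead of missing().
filter_rows_null <- function(x, remove_na = TRUE, drop_na = NULL, ...) {
  if (!is.null(drop_na)) {
    warning(
      "Argument `drop_na` is deprecated. Please use `remove_na` instead.",
      call. = FALSE
    )
    remove_na <- drop_na
  }
  if (isTRUE(remove_na)) {
    x <- x[stats::complete.cases(x), , drop = FALSE]
  }
  x
}

# Toy data: only the first row is complete.
toy <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

new_call <- filter_rows(toy, remove_na = FALSE)
old_call <- suppressWarnings(filter_rows(toy, drop_na = FALSE))
null_call <- suppressWarnings(filter_rows_null(toy, drop_na = FALSE))
```

Both variants give old and new spellings the same result. The `NULL` default mainly keeps defaulted arguments together in the signature, at the cost of treating an explicit `drop_na = NULL` as "not supplied".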
14 changes: 12 additions & 2 deletions man/data_match.Rd


4 changes: 2 additions & 2 deletions tests/testthat/test-data_match.R
@@ -23,7 +23,7 @@
match = "or",
return_indices = TRUE
))
x2 <- nrow(poorman::filter(efc, c172code == 1 | e16sex == 2))

# Lint warning (nrow_subset_linter), tests/testthat/test-data_match.R, line 26, col 9:
# Use arithmetic to count the number of rows satisfying a condition, rather than
# fully subsetting the data.frame and counting the resulting rows. For example,
# replace nrow(subset(x, is_treatment)) with sum(x$is_treatment).
# NB: use na.rm = TRUE if `is_treatment` has missing values.
expect_identical(x1, x2)

# "AND" works
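
The `nrow_subset_linter` advice quoted above can be checked on a small example. This is a sketch with made-up data (the `toy` data frame and its columns are assumptions, not the `efc` data used in the tests), and it uses base `subset()` as in the lint message rather than `poorman::filter()`:

```r
# Toy data with missing values in both columns.
toy <- data.frame(
  code = c(1, 2, NA, 1, 3, NA),
  sex  = c(2, 2, 2, NA, 1, 1)
)

# Subset-then-count: builds an intermediate data frame first.
n_subset <- nrow(subset(toy, code == 1 | sex == 2))

# Arithmetic count: sums the logical condition directly.
# na.rm = TRUE skips rows where the condition evaluates to NA,
# which subset() drops as well.
n_sum <- sum(toy$code == 1 | toy$sex == 2, na.rm = TRUE)
```

Both approaches count the same rows here; the arithmetic form simply avoids materialising the filtered data frame.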
@@ -33,7 +33,7 @@
match = "and",
return_indices = TRUE
))
x2 <- nrow(poorman::filter(efc, c172code == 1, e16sex == 2))

# Lint warning (nrow_subset_linter), tests/testthat/test-data_match.R, line 36, col 9:
# Use arithmetic to count the number of rows satisfying a condition, rather than
# fully subsetting the data.frame and counting the resulting rows.
expect_identical(x1, x2)

# "NOT" works
@@ -43,7 +43,7 @@
match = "not",
return_indices = TRUE
))
x2 <- nrow(poorman::filter(efc, c172code != 1, e16sex != 2))

# Lint warning (nrow_subset_linter), tests/testthat/test-data_match.R, line 46, col 9:
# Use arithmetic to count the number of rows satisfying a condition, rather than
# fully subsetting the data.frame and counting the resulting rows.
expect_identical(x1, x2)

# remove NA
@@ -52,15 +52,15 @@
data.frame(c172code = 1, e16sex = 2),
match = "not",
return_indices = TRUE,
- drop_na = FALSE
+ remove_na = FALSE
))
expect_identical(x1, 41L)
x1 <- length(data_match(
efc,
data.frame(c172code = 1, e16sex = 2),
match = "not",
return_indices = TRUE,
- drop_na = TRUE
+ remove_na = TRUE
))
expect_identical(x1, 36L)
})
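
The two expectations above differ because removing incomplete rows changes both how many rows can match and which positions the indices refer to. The sketch below illustrates that effect in base R on made-up data; it is not datawizard's internal logic, and `toy`, `grp`, and `sex` are hypothetical names.

```r
# Toy data: rows 2 and 3 each contain a missing value.
toy <- data.frame(
  grp = c(1, NA, 2, 1, 2),
  sex = c(2, 2, NA, 2, 1)
)

# Indices computed on the full data refer to rows of `toy` itself.
keep_full <- which(toy$grp == 1 & toy$sex == 2)

# Dropping incomplete rows first (the same complete.cases() step that
# appears in the diff) renumbers positions, so the indices now refer
# to the reduced data.
complete <- toy[stats::complete.cases(toy), , drop = FALSE]
keep_complete <- which(complete$grp == 1 & complete$sex == 2)

# The original row labels survive as row names of the reduced data.
original_rows <- rownames(complete)[keep_complete]
```

Here `keep_full` and `keep_complete` point at the same observations but use different numbering, which is the situation the `remove_na` documentation describes when row indices are requested.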
@@ -231,7 +231,7 @@
)

foo3 <- function(data) {
var <- "mpg >= 30"

# Lint warning (object_overwrite_linter), tests/testthat/test-data_match.R, line 234, col 5:
# 'var' is an exported object from package 'stats'. Avoid re-using such symbols.
data_filter(data, var)
}
expect_identical(