Releases: easystats/datawizard
datawizard 1.0.0
BREAKING CHANGES AND DEPRECATIONS
- datawizard now requires R >= 4.0 (#515).
- Argument `drop_na` in `data_match()` is now deprecated. Please use `remove_na` instead (#556).
- In `data_rename()` (#567):
  - argument `pattern` is deprecated. Use `select` instead.
  - argument `safe` is deprecated. The function now errors when `select` contains unknown column names.
  - when `replacement` is `NULL`, an error is now thrown (previously, column indices were used as new names).
  - if `select` (previously `pattern`) is a named vector, then all elements must be named, e.g. `c(length = "Sepal.Length", "Sepal.Width")` errors.
- The order of the arguments `by` and `probability_weights` in `rescale_weights()` has changed, because for `method = "kish"`, the `by` argument is optional (#575).
- The rescaled weights variables returned by `rescale_weights()` have been renamed. `pweights_a` and `pweights_b` are now named `rescaled_weights_a` and `rescaled_weights_b` (#575).
- `print()` methods for `data_tabulate()` with multiple sub-tables (i.e. when length of `by` was > 1) were revised. Now, an integrated table instead of multiple tables is returned. Furthermore, `print_html()` did not work, which has also been fixed (#577).
- `demean()` (and `degroup()`) gets an `append` argument that defaults to `TRUE`, to append the centered variables to the original data frame, instead of returning the de- and group-meaned variables only. Use `append = FALSE` for the previous default behaviour (i.e. only returning the newly created variables) (#579).
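
The `data_rename()` and `demean()` changes above can be sketched as follows. This is a minimal illustration, assuming datawizard >= 1.0.0 is installed; the built-in `iris` data and the new names `length` and `width` are chosen for illustration only.

```r
library(datawizard)

# `select` replaces the deprecated `pattern` argument
renamed <- data_rename(
  iris,
  select = c("Sepal.Length", "Sepal.Width"),
  replacement = c("length", "width")
)
colnames(renamed)[1:2]

# a fully named vector also works: names are the new column names,
# values are the old ones (every element must be named)
renamed2 <- data_rename(
  iris,
  select = c(length = "Sepal.Length", width = "Sepal.Width")
)

# append = FALSE restores the previous behaviour of demean():
# only the newly created variables are returned
centered <- demean(iris, select = "Sepal.Length", by = "Species", append = FALSE)
colnames(centered)
```

With the default `append = TRUE`, the same call would instead return all original columns plus the new ones.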
CHANGES
- `rescale_weights()` gets a `method` argument, to choose the method used to rescale weights. Options are `"carle"` (the default) and `"kish"` (#575).
- The `select` argument, which is available in different functions to select variables, can now also be a character vector with quoted variable names, including a colon to indicate a range of several variables (e.g. `"cyl:gear"`) (#551).
- New function `row_sums()`, to calculate row sums (optionally with a minimum number of valid values), as complement to `row_means()` (#552).
- New function `row_count()`, to count specific values row-wise (#553).
- `data_read()` no longer shows a warning about forthcoming breaking changes in upstream packages when reading `.RData` files (#557).
- `data_modify()` now recognizes `n()`, for example to create an index for data groups with `1:n()` (#535).
- The `replacement` argument in `data_rename()` now supports glue-styled tokens (#563).
- `data_summary()` also accepts the results of `bayestestR::ci()` as summary function (#483).
- `ranktransform()` has a new argument `zeros` to determine how zeros should be handled when `sign = TRUE` (#573).
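
A short sketch of the quoted range selection and the two new row-wise helpers listed above, assuming datawizard >= 1.0.0. The small data frame `dat` is made up for illustration.

```r
library(datawizard)

# quoted range of columns, from "cyl" through "gear"
sub <- data_select(mtcars, "cyl:gear")
colnames(sub)

# toy data with missing values (hypothetical example data)
dat <- data.frame(
  c1 = c(1, 2, NA, 4),
  c2 = c(NA, 2, NA, 5),
  c3 = c(NA, 4, NA, NA),
  c4 = c(2, 3, 7, 8)
)

# row sums, computed only for rows with at least two valid values;
# rows with fewer valid values yield NA
sums <- row_sums(dat, min_valid = 2)

# count how often the value 2 occurs in each row
counts <- row_count(dat, count = 2)
```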
BUG FIXES
datawizard 0.13.0
BREAKING CHANGES
- `data_rename()` now errors when the `replacement` argument contains `NA` values or empty strings (#539).
- Removed deprecated functions `get_columns()`, `data_find()`, `format_text()` (#546).
- Removed deprecated arguments `group` and `na.rm` in multiple functions. Use `by` and `remove_na` instead (#546).
- The default value for the argument `dummy_factors` in `to_numeric()` has changed from `TRUE` to `FALSE` (#544).
CHANGES
- The `pattern` argument in `data_rename()` can also be a named vector. In this case, names are used as values for the `replacement` argument (i.e. `pattern` can be a character vector using `<new name> = "<old name>"`).
- `categorize()` gains a new `breaks` argument, to decide whether breaks are inclusive or exclusive (#548).
- The `labels` argument in `categorize()` gets two new options, `"range"` and `"observed"`, to use the range of categorized values as labels (i.e. factor levels) (#548).
- Minor additions to `reshape_ci()` to work with forthcoming changes in the `{bayestestR}` package.
datawizard 0.12.3
CHANGES
- `demean()` (and `degroup()`) now also work for nested designs, if argument `nested = TRUE` and `by` specifies more than one variable (#533).
- Vignettes are no longer provided in the package; they are now only available on the website. Only one "Overview" vignette remains in the package, and it contains links to the other vignettes on the website. This is because CRAN errors occur when building vignettes on macOS and we couldn't determine the cause after multiple patch releases (#534).
datawizard 0.12.2
- Remove `htmltools` from `Suggests` in an attempt to fix an error in CRAN checks due to failures to build a vignette (#528).
datawizard 0.12.0
BREAKING CHANGES
- The argument `include_na` in `data_tabulate()` and `data_summary()` has been renamed to `remove_na`. Consequently, to mimic former behaviour, `FALSE` and `TRUE` need to be switched (i.e. `remove_na = TRUE` is equivalent to the former `include_na = FALSE`).
- Class names for objects returned by `data_tabulate()` have been changed to `datawizard_table` and `datawizard_crosstable` (resp. the plural forms, `*_tables`), to provide a clearer and more consistent naming scheme.
CHANGES
- `data_select()` can directly rename selected variables when a named vector is provided in `select`, e.g. `data_select(mtcars, c(new1 = "mpg", new2 = "cyl"))`.
- `data_tabulate()` gains an `as.data.frame()` method, to return the frequency table as a data frame. The structure of the returned object is a nested data frame, where the first column contains the name of the variable for which frequencies were calculated, and the second column contains the frequency table.
- `demean()` (and `degroup()`) now also work for cross-classified designs, or more generally, for data with multiple grouping or cluster variables (i.e. `by` can now specify more than one variable).
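
As a rough illustration of the first two entries above, assuming datawizard >= 0.12.0 and using the built-in `mtcars` data (the names `new1` and `new2` are taken from the entry itself):

```r
library(datawizard)

# select and rename in one step: names are the new column names
selected <- data_select(mtcars, c(new1 = "mpg", new2 = "cyl"))
colnames(selected)

# frequency table for one variable, excluding missing values
tab <- data_tabulate(mtcars, select = "cyl", remove_na = TRUE)

# convert the result to a (nested) data frame
tab_df <- as.data.frame(tab)
str(tab_df, max.level = 1)
```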
datawizard 0.11.0
BREAKING CHANGES
- Arguments named `group` or `group_by` are deprecated and will be removed in a future release. Please use `by` instead. This affects the following functions in datawizard (#502):
  - `data_partition()`
  - `demean()` and `degroup()`
  - `means_by_group()`
  - `rescale_weights()`
- The following aliases are deprecated and will be removed in a future release (#504):
  - `get_columns()`, use `data_select()` instead.
  - `data_find()` and `find_columns()`, use `extract_column_names()` instead.
  - `format_text()`, use `text_format()` instead.
CHANGES
- `recode_into()` is more relaxed regarding checking the type of `NA` values. If you recode into a numeric variable, and one of the recode values is `NA`, you no longer need to use `NA_real_` for numeric `NA` values.
- Improved documentation for some functions.
BUG FIXES
- `data_to_long()` did not work for data frames where columns had attributes (like labelled data).
datawizard 0.10.0
BREAKING CHANGES
- The following arguments were deprecated in 0.5.0 and are now removed:
  - in `data_to_wide()`: `colnames_from`, `rows_from`, `sep`
  - in `data_to_long()`: `colnames_to`
  - in `data_partition()`: `training_proportion`
NEW FUNCTIONS
- `data_summary()`, to compute summary statistics of (grouped) data frames.
- `data_replicate()`, to expand a data frame by replicating rows based on another variable that contains the counts of replications per row.
CHANGES
- `data_modify()` gets three new arguments, `.at`, `.if` and `.modify`, to modify variables at specific positions or based on logical conditions.
- `data_tabulate()` was revised and gets several new arguments: a `weights` argument, to compute weighted frequency tables, and `include_na`, to include or omit missing values from the table. Furthermore, a `by` argument was added, to compute crosstables (#479, #481).
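
The two new functions in this release can be sketched as below, assuming datawizard >= 0.10.0. The column names `mean_mpg` and `sd_mpg` and the data frame `d` are illustrative choices, not part of the package.

```r
library(datawizard)

# summary statistics of mpg, computed separately for each level of `am`
smry <- data_summary(mtcars, mean_mpg = mean(mpg), sd_mpg = sd(mpg), by = "am")
smry

# replicate each row by the count stored in column `n`
# (hypothetical example data)
d <- data.frame(id = c("a", "b", "c"), n = c(1, 3, 2))
expanded <- data_replicate(d, expand = "n")
nrow(expanded)
```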
datawizard 0.9.1
CHANGES
- `rescale()` gains `multiply` and `add` arguments, to expand ranges by a given factor or value.
- `to_factor()` and `to_numeric()` now support class `haven_labelled`.
BUG FIXES
- `to_numeric()` now correctly deals with inverted factor levels when `preserve_levels = TRUE`.
- `to_numeric()` inverted the order of value labels when `dummy_factors = FALSE`.
- `convert_to_na()` now preserves attributes for factors when `drop_levels = TRUE`.
datawizard 0.9.0
NEW FUNCTIONS
- `row_means()`, to compute row means, optionally only for the rows with at least `min_valid` non-missing values.
- `contr.deviation()` for sum-deviation contrast coding of factors.
- `means_by_group()`, to compute mean values of variables, grouped by levels of specified factors.
- `data_seek()`, to seek for variables in a data frame, based on their column names, variable labels, value labels or factor levels. Searching for labels only works for "labelled" data, i.e. when variables have a `label` or `labels` attribute.
CHANGES
- `recode_into()` gains an `overwrite` argument to skip overwriting already recoded cases when multiple recode patterns apply to the same case.
- `recode_into()` gains a `preserve_na` argument to preserve `NA` values when recoding.
- `data_read()` now passes the `encoding` argument to `data.table::fread()`. This allows reading files with non-ASCII characters.
- `datawizard` moves from the GPL-3 license to the MIT license.
- `unnormalize()` and `unstandardize()` now work with grouped data (#415).
- `unnormalize()` now errors instead of emitting a warning if it doesn't have the necessary info (#415).
BUG FIXES
- Fixed issue in `labels_to_levels()` when values of labels were not in sorted order and values were not sequentially numbered.
- Fixed issues in `data_write()` when writing labelled data into SPSS format and vectors were of a different type than the value labels.
- Fixed issue in `recode_into()` with a probably wrong case number printed in the warning when several recode patterns match one case.
- Fixed issue in `recode_into()` when original data contained `NA` values and `NA` was not included in the recode pattern.
- Fixed issue in `data_filter()` where functions containing a `=` (e.g. when naming arguments, like `grepl(pattern, x = a)`) were mistakenly seen as faulty syntax.
- Fixed issue in `empty_column()` for strings with invalid multibyte strings. For such data frames or files, `empty_column()` or `data_read()` no longer fails.
datawizard 0.8.0
BREAKING CHANGES
- The following re-exported functions from `{insight}` have now been removed: `object_has_names()`, `object_has_rownames()`, `is_empty_object()`, `compact_list()`, `compact_character()`.
- Argument `na.rm` was renamed to `remove_na` throughout `{datawizard}` functions. `na.rm` is kept for backward compatibility, but will be deprecated and later removed in future updates.
- The way expressions are defined in `data_filter()` was revised. The `filter` argument was replaced by `...`, allowing multiple expressions to be separated with a comma (which are then combined with `&`). Furthermore, expressions can now also be defined as strings, or be provided as character vectors, to allow string-friendly programming.
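
The revised `data_filter()` interface described above might be used like this. It is a minimal sketch assuming datawizard >= 0.8.0, with `mtcars` as example data.

```r
library(datawizard)

# multiple expressions separated by commas are combined with `&`
f1 <- data_filter(mtcars, vs == 0, am == 1)

# the same filter, defined as a string
f2 <- data_filter(mtcars, "vs == 0 & am == 1")

# both calls should select the same rows
nrow(f1)
nrow(f2)
```

The string form is convenient when the filter expression is built programmatically, e.g. pasted together from variable names.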
CHANGES
- Weighted functions (`weighted_sd()`, `weighted_mean()`, ...) gain a `remove_na` argument, to remove or keep missing and infinite values. By default, `remove_na = TRUE`, i.e. missing and infinite values are removed.
- `reverse_scale()`, `normalize()` and `rescale()` gain an `append` argument (similar to other data frame methods of transformation functions), to append recoded variables to the input data frame instead of overwriting existing variables.
NEW FUNCTIONS
- `rowid_as_column()` to complement `rownames_as_column()` (and to mimic `tibble::rowid_to_column()`). Note that its behavior is different from `tibble::rowid_to_column()` for grouped data. See the Details section in the docs.
- `data_unite()`, to merge values of multiple variables into one new variable.
- `data_separate()`, as counterpart to `data_unite()`, to separate a single variable into multiple new variables.
- `data_modify()`, to create new variables, or modify or remove existing variables in a data frame.
MINOR CHANGES
- `to_numeric()` for variables of type `Date`, `POSIXct` and `POSIXlt` now includes the class name in the warning message.
- Added a `print()` method for `center()`, `standardize()`, `normalize()` and `rescale()`.
BUG FIXES
- `standardize_parameters()` now works when the package namespace is in the model formula (#401).
- `data_merge()` no longer yields a warning for `tibbles` when `join = "bind"`.
- `center()` and `standardize()` did not work for grouped data frames (of class `grouped_df`) when `force = TRUE`.
- The `data.frame` method of `describe_distribution()` returns `NULL` instead of an error if no valid variables were passed (for example a factor variable with `include_factors = FALSE`) (#421).