Adheres to Semantic Versioning and Keep a Changelog.
- Full support for Pandas 2
- zstd compression
- YAML format
- ZIP compression
- Support for Python < 3.9
- Support for Pandas < 1.5.2
- Support for Poetry < 1.4
- `MiscUtils.table_format`
- Positional args to `to_` and `read_` methods
- `cli_help` example
- Use `import typeddfs` instead of `from typeddfs import TypedDfs`
- Excel write methods no longer use `encoding`
- Custom exceptions no longer have multiple superclasses
- `FrozeSet` now subclasses `frozenset` instead of `AbstractSet`
- Use poetry-core and new Poetry options
- Enable virtualenv>20.0.33
- Dropped `pd.DataFrame.append` for Pandas>=2 (it was removed)
- Added `pd.DataFrame.map` for Pandas>=2.1
- Replaced deprecated Pandas functions
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.16.5...v0.17.0
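As a rough illustration of what the `FrozeSet` superclass change in this release means for callers, here is a minimal standard-library sketch. `AbstractOnly` and `FrozenSketch` are hypothetical stand-ins, not the typeddfs classes; the sketch only shows that a `frozenset` subclass passes a concrete `isinstance(x, frozenset)` check, while a class built solely on the abstract set base class does not.

```python
from collections.abc import Set


class AbstractOnly(Set):
    """Hypothetical set built only on the abstract base class."""

    def __init__(self, items):
        self._items = frozenset(items)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


class FrozenSketch(frozenset):
    """Hypothetical set that subclasses the built-in frozenset."""


abstract_only = AbstractOnly([1, 2, 3])
frozen = FrozenSketch([1, 2, 3])

# Both support membership tests like any set.
both_contain_two = 2 in abstract_only and 2 in frozen

# Only the frozenset subclass passes a concrete isinstance check.
is_frozenset_abstract = isinstance(abstract_only, frozenset)
is_frozenset_frozen = isinstance(frozen, frozenset)
```

Code that checks for `frozenset` specifically (rather than the abstract set type) is where this difference would be visible.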
No changes.
- `MiscUtils.delete_file`
- `Checksums.delete_any`
- `Checksums.generate_dirsum`
- One less `.resolve` in `Checksums`
- Return a more specific exception for non-relative path
- `JsonUtils.preserve_inf` would error on some lists
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.16.3...v0.16.4
- Can concatenate with `.of`
- `to=` arg to `pretty_print`
- `Utils.choose_table_format`
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.16.2...v0.16.3
- `EMPTY` class attributes on froze collections
- `is_empty` and `length` on froze collections
- `SortUtils.core_natsort_flags`
- `read_file` and `write_file` now call `.resolve` on the paths
- Renamed ExampleDataframes to ExampleDfs and LazyDataframe to LazyDf
- Returned tuples are now mainly namedtuples
- Compat with natsort 8 (now required)
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.16.1...v0.16.2
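Switching return values from plain tuples to namedtuples is generally backward compatible for callers that unpack results positionally. A minimal sketch, assuming a hypothetical `HashResult` return type (the name and its fields are illustrative only and are not taken from typeddfs):

```python
from typing import NamedTuple


class HashResult(NamedTuple):
    """Hypothetical return type; the field names are illustrative only."""

    path: str
    digest: str


def describe(path: str) -> HashResult:
    # Stand-in for a function that previously returned a plain (path, digest) tuple.
    return HashResult(path=path, digest="abc123")


result = describe("data.csv")

# Existing code that unpacks positionally keeps working.
path, digest = result

# New code can use field names instead of indices.
same_digest = result.digest == result[1]
```

Because a namedtuple is still a `tuple`, indexing, unpacking, and equality with plain tuples behave as before.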
- `datasets.py`
- `FileFormat.split` and related methods
- `AbsDf.read_url`
- `read_file` uses IOTyping when `attrs=None`
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.16.0...v0.16.1
- Support for custom formats and per-suffix kwargs
- Support for json encoding args for .attrs
- `Checksums`
- Simplified `.suffixes` (breaking change to an uncommon function)
- Duplicate IO methods
- Checksum path resolution
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.15.0...v0.16.0
- New functions in `Checksums`
- JSON encoding functions in `JsonUtils`
- Checksums now holds the encoding (breaking)
- `use_filename=False` by default (was `None`)
- attrs now written with `orjson` and custom default
- orjson is now required
- internal refactoring
- `attrs=True` less brittle
- Pandas >= 1.3 is explicitly required; this was already effectively true
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.14.2...v0.15.0
- Top-level imports
- Moved some docs to readthedocs
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.14.3...v0.14.4
- Some new methods to `Checksums`
- `Checksums` functions can now accept strings
Full Changelog: https://github.com/dmyersturnbull/typed-dfs/compare/v0.16.2...v0.14.3
- Two minor bugs
- Greatly improved `cli_help`
- `CoreDf.set_attrs`
- Reading and writing dataset metadata (attrs)
- Improved `cli_help`
- Arguments in `write_file` and `read_file` are now keyword-only
- Small parts of `Checksums`
- `write_file` checks for existing hashes and write permissions before attempting to write
- extras "all" and "main"
- docstring rst issues, esp. broken links
- metadata always passed to custom exceptions
- Some uncommon Excel suffixes
- `cli_help`
- `FileFormat.is_recommended`
- `FileFormat.matches`
- `recommended_only()` to builders
- `CompressionFormat.strip`
- Deprecation warning in test
- matrix row and column names now always typed as str
- Preview support for .properties, INI, and TOML
- Top-level imports `typed`, `untyped`, etc.
- Q & A in the readme
- `remap_suffixes()` renamed to `suffix()`
- `.subclass()` now supports multiple inheritance
- `to_parquet()` doesn't change short to int
- `CoreDf.strip_control_chars`
- `Utils.exact_natsort_alg`, `Utils.guess_natsort_alg`, and `Utils.all_natsort_flags`
- Functions from `pandas.api.types` to `Utils`
- `DfSupport.reload()`
- `DfTyping` is now generic
- `FrozeSet` and `FrozeDict` are now ordered
- Moved checksum utils to `checksums.py`
- Moved `DfSupport` to `_format_support.py`
- `sort_natural` now infers the best algorithm from the data type, by default
- `drop_cols` can now accept `*args`
- Split `parse_hash_file` into `parse_hash_file_resolved` and `parse_hash_file_generic`
- `regex` is now a dependency
- Hashing options in `write_file`
- Some `Utils` and `FileFormat` params are keyword-only
- `Utils.verify_any_hash`
- Positional args from `ffill` and `bfill`
- Bugs in `exact_natsort_alg`
- Small bugs in `FrozeDict` and `FrozeSet`
- Dataclass conversion
- Utils for freezing types
- `from_records` now calls `convert`
- Hash utils in `Utils`
- `file_hash`, `dir_hash`, and `mkdirs` to `write_file`
- `to_rst`
- Moved DF classmethods to `DfTyping`
- DF operators now attempt to keep typing
- All MatrixDfs are now strict
- MatrixDF row and column names now must always be "row" and "column"
- `.newline` in builder
- `BaseDf.of` as an alias to `BaseDf.convert`
- `empty_df` methods
- `index_series_name` and `column_series_name`
- `FileFormat.strip_compression`, `FileFormat.compression_from_path`, and related
- Index series and column series names are set to None by default in `TypedDf`
- String types are now required for column/index names in `MatrixDf`
- `MatrixDf.strict` is True by default
- `Utils.table_formats`
- Added tests for `symmetrize` and a few others
- Matrix DFs
- Pickle support
- `Utils`
- `AbsDf.text_encoding`
- Extras `excel` and `xlsb`
- `AbsDf.read_html`
- To `TypedDfBuilder`: `remap_suffixes`, `encoding`, `newlines`, `subclass`, and `add_methods`
- `TypedDf.is_valid` no longer tries to convert; it just uses the DataFrame as-is
- Text encoding is UTF-8 by default, dictated by `AbsDf.text_encoding`
- `extra_requirements` renamed to `verifications`
- `fastparquet` no longer used in `parquet` extra
- `CoreDf.transpose` now overridden and re-types.
- `read_excel` uses openpyxl by default for XLSX-like, XLS, and ODS-like (in contrast to Pandas)
- `post_processing`, `verifications`, and related functions were moved up to `BaseDf`
- Some `AbsDf` delegates to `DataFrame` now just take `*args` and `**kwargs` for simplicity.
- `tabulate` and `wcwidth` are now required dependencies.
- Optional dependencies that are not used directly now have >= version ranges
- You can now write empty DataFrames to Feather.
- `to_excel` is much less likely to error for ODF, ODS, ODT, and XLS.
- Keyword arguments added via `write_kwargs` and `read_kwargs` no longer clash between CSV and TSV.
- Possible bugs reading and writing to fwf and flexwf (use `disable_numparse`)
- `nl` and `bom` options. See `.newline` and `.encoding` in `TypedDfBuilder` for alternatives.
- Some deprecated options.
- Support for `to_xml` and `read_xml`
- `can_read` and `can_write` on `BaseDf` to get supported file formats
- Write (and read) to "flex" fixed-width; currently, this is only used for ".flexwf" as a preview
- `pretty_print`, which delegates to tabulate
- Optional post-processing method (`TypedDf.post_process`)
- `known_column_names`, `known_index_names`, and `known_names`
- Methods to set default read_file/to_file args
- All args from `read_file` and `to_file`
- `comment` from `to_lines`; it was too confusing because no other write functions had one
- `dtype` values in `TypedDfBuilder` are now used; specifically, `TypedDf.convert` calls `pd.Series.astype` with them.
- Overrode `assign` to handle indices
- Split some functionality of `AbsDf` into a superclass `_CoreDf`
- Bumped pyarrow to 4.0
- Various functions return more specific error types
- Deprecated `TypedDfBuilder.condition` (renamed to `verify`)
- Passing `inplace=True` where not supported now raises an error instead of warning
- All `write_file` serialization now requires column names to be str for consistency
- Empty DataFrames are read via `BaseDf.read_csv`, etc. without issue (`pd.read_csv` normally fails)
- `to_lines` and `read_lines` are fully inverses
- Read/write are inverses for untyped DFs for all formats
- Deleted .dockerignore and codemeta.json
- `check` workflow no longer errors on push
- Better read/write tests; enabled Parquet-format tests
- `vanilla_reset`
- Unused Sphinx/readthedocs files
- Not passing kwargs to `UntypedDf.to_csv`
- Simplified some read/write code
- Read/write wrappers for Feather, Parquet, and JSON
- Added general functions `read_file` and `write_file`
- `TypedDfs.wrap` and `FinalDf`
- `to_csv` was not passing along `args` and `kwargs`
- Slightly better build config
- Made `tables` an optional dependency; use `typeddfs[hdf5]`
- `natsort` is no longer pinned to version 7; it's now `>=7`. Added a note in the readme that this just requires some caution.
- Slight improvement to build and metadata
- support for Python 3.7
- Bumped Pandas to 1.2
- Updated build
- `require_full` argument
- support for Pandas <1.1
- `convert` now keeps non-reserved indices in the index as long as `more_indices_allowed` is false
- Moved builder to a separate module
- Changed or added type annotations using `__qualname__`
- Moved some basic functions from `AbsFrame` to its superclass `PrettyFrame`
- A method on `BaseFrame` called `such_that` to do type-retaining slicing
- A bug in `only`
- A bug in checking symmetry
- Dropped unnecessary imports
- Clarified that `detype` is needed for functions like `applymap` if requirements will fail the returned value
- Improved test coverage
- Added docstrings
- Builder and static factory for new classes
- Symmetry and custom conditions
- Renamed most classes
- Renamed `to_vanilla` to `vanilla`, dropping the latter
- Split code into several files
- Main code.