Releases: pathwaycom/pathway

v0.7.1

17 Nov 12:57

Added

  • Experimental Google Drive input connector.
  • Stateful deduplication function (pw.stateful.deduplicate), which allows alerting on significant changes.
  • The ability to split data into batches in pw.debug.table_from_markdown and pw.debug.table_from_pandas.
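The idea behind stateful deduplication can be sketched in plain Python. This is only an illustration of the concept, not the signature or implementation of pw.stateful.deduplicate: the names deduplicate_stream, significant_change and threshold are hypothetical, and the sketch assumes an acceptor callback that compares each new value with the last accepted one.

```python
from typing import Callable, Iterable, Iterator, Optional


def deduplicate_stream(
    values: Iterable[float],
    acceptor: Callable[[float, Optional[float]], bool],
) -> Iterator[float]:
    """Yield only the values that the acceptor considers worth reporting.

    Hypothetical helper: it keeps the last accepted value as state and
    asks the acceptor whether a new value should replace it.
    """
    last_accepted: Optional[float] = None
    for value in values:
        if acceptor(value, last_accepted):
            last_accepted = value
            yield value


def significant_change(new: float, old: Optional[float], threshold: float = 5.0) -> bool:
    """Accept the first value, then only changes of at least `threshold`."""
    return old is None or abs(new - old) >= threshold


readings = [20.0, 21.0, 24.9, 25.0, 26.0, 31.0, 30.0]
alerts = list(deduplicate_stream(readings, significant_change))
print(alerts)
```

Each accepted value becomes the new reference point, so small fluctuations around it are suppressed until the difference reaches the threshold again.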

v0.7.0

16 Nov 11:34

Added

  • Class Behavior, a superclass of all behavior classes.
  • Class ExactlyOnceBehavior, which indicates that a CommonBehavior should be created in which each window produces exactly one output (shifted in time by an optional shift parameter).
  • Function exactly_once_behavior, which creates an instance of ExactlyOnceBehavior.
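What "each window producing exactly one output" means can be illustrated with a small standalone simulation. This is a sketch under stated assumptions, not Pathway's implementation: it assumes tumbling windows over integer timestamps, treats the largest timestamp seen so far as the current time, and uses the hypothetical function name exactly_once_windows.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple


def exactly_once_windows(
    events: Iterable[Tuple[int, str]],
    duration: int,
    shift: int = 0,
) -> List[Tuple[int, List[str]]]:
    """Emit each tumbling window once, when time reaches window end + shift.

    Assumption: events arrive as (time, value) pairs and the largest time
    seen so far plays the role of the current time. Events arriving for an
    already emitted window are dropped. Windows still open when the input
    ends are not emitted.
    """
    buffers: Dict[int, List[str]] = defaultdict(list)
    closed: Set[int] = set()
    emitted: List[Tuple[int, List[str]]] = []
    current_time: Optional[int] = None

    for time, value in events:
        start = (time // duration) * duration
        if start not in closed:
            buffers[start].append(value)
        current_time = time if current_time is None else max(current_time, time)
        # Close every buffered window whose (shifted) end has been reached.
        for window_start in sorted(buffers):
            if window_start + duration + shift <= current_time:
                emitted.append((window_start, buffers.pop(window_start)))
                closed.add(window_start)
    return emitted


events = [(1, "a"), (3, "b"), (11, "c"), (2, "late"), (25, "d")]
no_shift = exactly_once_windows(events, duration=10)
with_shift = exactly_once_windows(events, duration=10, shift=5)
print(no_shift)
print(with_shift)
```

With no shift, the late event at time 2 arrives after its window has been emitted and is dropped. With a shift of 5 the window is still open when that event arrives, so it is included in the single output.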

Changed

  • BREAKING: WindowBehavior is now called CommonBehavior, as it can also be used with interval joins.
  • BREAKING: window_behavior is now called common_behavior, as it can also be used with interval joins.
  • Deprecated the keep_queries parameter in pw.io.http.rest_connector. Use delete_completed_queries, which has the opposite meaning, instead. The default is still delete_completed_queries=True (equivalent to keep_queries=False), but it will soon be required to set it explicitly.
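Because the two parameters have opposite meanings, migrating call sites means inverting the value, not just renaming the keyword. A minimal sketch of that translation follows. The helper migrate_rest_connector_kwargs is hypothetical and is not part of Pathway; it only rewrites a dictionary of keyword arguments.

```python
from typing import Any, Dict


def migrate_rest_connector_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Translate the deprecated keep_queries flag to delete_completed_queries.

    Hypothetical helper for updating call sites: the new flag is the
    logical negation of the old one. All other arguments pass through.
    """
    migrated = dict(kwargs)
    if "keep_queries" in migrated:
        if "delete_completed_queries" in migrated:
            raise TypeError(
                "pass only one of keep_queries and delete_completed_queries"
            )
        migrated["delete_completed_queries"] = not migrated.pop("keep_queries")
    return migrated


old_style = {"host": "127.0.0.1", "port": 8080, "keep_queries": True}
new_style = migrate_rest_connector_kwargs(old_style)
print(new_style)
```

Setting the flag explicitly at every call site also prepares the code for the point at which the parameter becomes required.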

v0.6.0

10 Nov 11:09

Added

  • A flag with_metadata for the filesystem-based connectors to attach the source file metadata to the table entries.
  • Methods pw.debug.table_from_list_of_batches and pw.debug.table_from_list_of_batches_by_workers for creating tables with defined data being inserted over time.

Changed

  • BREAKING: pw.debug.table_from_pandas and pw.debug.table_from_markdown now create tables in streaming mode, instead of static mode, if the given table definition contains a _time column.
  • BREAKING: Renamed the parameter keep_queries in pw.io.http.rest_connector to delete_queries, with the opposite meaning. This changes the default behavior: it was keep_queries=False, and it is now delete_queries=False.

v0.5.3

27 Oct 16:22

Added

  • A method get_nearest_items_asof_now in KNNIndex that returns nearest neighbors without updating old queries in the future.
  • A method asof_now_join in Table that joins rows from the left side of the join with the right side of the join at their processing time. Past rows from the left side are not used when new data appears on the right side.
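The "as of now" semantics can be illustrated with a small event-by-event simulation. This is a sketch of the described behavior only, not Pathway's implementation or API: the function asof_now_join_sketch and the (side, key, value) event format are assumptions made for the example, and it models an inner join in which unmatched left rows produce no output.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, str]  # (side, join key, value)


def asof_now_join_sketch(events: List[Event]) -> List[Tuple[str, str, str]]:
    """Join each left row with the right rows known at its arrival.

    Right-side events only update state. Left-side events are answered
    immediately and never revisited when the right side changes later.
    """
    right_state: Dict[str, List[str]] = {}
    output: List[Tuple[str, str, str]] = []
    for side, key, value in events:
        if side == "right":
            right_state.setdefault(key, []).append(value)
        elif side == "left":
            for right_value in right_state.get(key, []):
                output.append((key, value, right_value))
        else:
            raise ValueError(f"unknown side: {side!r}")
    return output


events = [
    ("right", "a", "r1"),
    ("left", "a", "q1"),   # sees only r1
    ("right", "a", "r2"),  # does not re-trigger q1
    ("left", "a", "q2"),   # sees r1 and r2
    ("left", "b", "q3"),   # no right rows for key "b"
]
result = asof_now_join_sketch(events)
print(result)
```

The same idea applies to get_nearest_items_asof_now: a query is answered against the index contents at the moment it is processed, and the answer is not updated afterwards.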

v0.5.2

19 Oct 06:16

Added

  • interval_join now supports forgetting old entries. The configuration can be passed using the behavior parameter of the interval_join method.
  • Decorator @table_transformer for marking that functions take Tables as arguments.
  • Namespace Table.C.* for all columns.
  • Output connectors now provide logs about the number of entries written and time taken.
  • Filesystem connectors now support reading whole files as rows.

v0.5.1

04 Oct 19:44

Fixed

  • select operates only on consistent states.

v0.5.0

04 Oct 05:15

Added

  • Schema method typehints, which returns a dict of mypy-compatible typehints.
  • Support for JSON parsing from CSV sources.
  • restrict method in Table, which restricts the table universe to the universe of another table.
  • Better support for postgresql types in the output connector.

Changed

  • BREAKING: tuple reducer used after intervals_over window now sorts values by time.
  • BREAKING: expressions used in select, filter, flatten, with_columns, with_id, with_id_from must have the same universe as the table. Earlier it was possible to use an expression from a superset of a table universe. To use expressions from wider universes, one can use restrict on the expression source table.
  • BREAKING: pw.universes.promise_are_equal(t1, t2) no longer allows using references from t1 and t2 in a single expression. To change the universe of a table, use with_universe_of.
  • BREAKING: ix and ix_ref are temporarily broken inside joins (both temporal and ordinary).
  • select, filter, concat keep columns as a single stream. The work for other operators is ongoing.
  • BREAKING: renamed Table method dtypes to typehints. It now returns a dict of mypy-compatible typehints.
  • BREAKING: Schema.__getitem__ returns a data class ColumnSchema containing all related information on a particular column.
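The universe rule for select and the role of restrict can be pictured with plain dictionaries, treating a table's universe as its set of row ids. This is a conceptual sketch only and does not use Pathway: the Table alias and the functions restrict and select_column are hypothetical stand-ins, and the check that the other table's ids must all exist in the source is an assumption of the sketch.

```python
from typing import Any, Dict

Table = Dict[str, Dict[str, Any]]  # row id -> {column name: value}


def restrict(source: Table, other: Table) -> Table:
    """Narrow `source` to the row ids (the universe) of `other`."""
    missing = other.keys() - source.keys()
    if missing:
        raise ValueError(f"ids not present in source: {sorted(missing)}")
    return {row_id: source[row_id] for row_id in other}


def select_column(table: Table, source: Table, column: str) -> Dict[str, Any]:
    """Pick `column` from `source`, requiring identical universes."""
    if table.keys() != source.keys():
        raise ValueError("universes differ; restrict the source table first")
    return {row_id: source[row_id][column] for row_id in table}


orders: Table = {"o1": {"item": "pen"}, "o2": {"item": "ink"}}
prices: Table = {"o1": {"price": 2}, "o2": {"price": 7}, "o3": {"price": 9}}

# prices has a wider universe than orders, so it is narrowed first.
narrowed = restrict(prices, orders)
selected = select_column(orders, narrowed, "price")
print(selected)
```

Using prices directly in select_column raises an error here, which mirrors the breaking change: an expression from a superset universe is no longer accepted without an explicit restriction.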

Fixed

  • Optional types other than string correctly output to PostgreSQL.

v0.4.1

25 Sep 07:12

Added

  • Support for messages compressed with zstd in the Kafka connector.

v0.4.0

21 Sep 12:31

Added

  • Support for JSON data format, including pw.Json type.
  • Methods as_int(), as_float(), as_str(), as_bool() to convert values from Json.

Changed

  • Method get() and [] to support accessing elements in Jsons.
  • Function pw.assert_table_has_schema for asserting that a given table has the same schema as the one passed as an argument.
  • BREAKING: ix and ix_ref operations are now standalone transformations of pw.Table into pw.Table. Most of the usages remain the same, but sometimes the user needs to provide a context (e.g. when using them inside join or groupby operations). ix and ix_ref are temporarily broken inside temporal joins.

Fixed

  • Fixed a bug where new-style optional types (e.g. int | None) were translated to Any dtype.

v0.3.4

18 Sep 10:18

Fixed

  • An incompatible beartype version is now excluded from dependencies.