⚡️ v2.8.2 Add batches to `verify pipes`, allow for multiple instances from WebAPI, and memory improvements.
v2.8.0 – v2.8.2
- Add batches to `Pipe.verify()`.
  Verification syncs now run in sequential batches so that they may be interrupted and resumed. See `Pipe.get_chunk_bounds_batches()` for more information:

  ```python
  from datetime import timedelta
  import meerschaum as mrsm

  pipe = mrsm.Pipe('demo', 'get_chunk_bounds', instance='sql:local')
  bounds = pipe.get_chunk_bounds(
      chunk_interval=timedelta(hours=10),
      begin='2025-01-10',
      end='2025-01-15',
      bounded=True,
  )
  batches = pipe.get_chunk_bounds_batches(bounds, workers=4)
  mrsm.pprint(
      [
          tuple(
              (str(bounds[0]), str(bounds[1]))
              for bounds in batch
          )
          for batch in batches
      ]
  )
  # [
  #     (
  #         ('2025-01-10 00:00:00+00:00', '2025-01-10 10:00:00+00:00'),
  #         ('2025-01-10 10:00:00+00:00', '2025-01-10 20:00:00+00:00'),
  #         ('2025-01-10 20:00:00+00:00', '2025-01-11 06:00:00+00:00'),
  #         ('2025-01-11 06:00:00+00:00', '2025-01-11 16:00:00+00:00')
  #     ),
  #     (
  #         ('2025-01-11 16:00:00+00:00', '2025-01-12 02:00:00+00:00'),
  #         ('2025-01-12 02:00:00+00:00', '2025-01-12 12:00:00+00:00'),
  #         ('2025-01-12 12:00:00+00:00', '2025-01-12 22:00:00+00:00'),
  #         ('2025-01-12 22:00:00+00:00', '2025-01-13 08:00:00+00:00')
  #     ),
  #     (
  #         ('2025-01-13 08:00:00+00:00', '2025-01-13 18:00:00+00:00'),
  #         ('2025-01-13 18:00:00+00:00', '2025-01-14 04:00:00+00:00'),
  #         ('2025-01-14 04:00:00+00:00', '2025-01-14 14:00:00+00:00'),
  #         ('2025-01-14 14:00:00+00:00', '2025-01-15 00:00:00+00:00')
  #     )
  # ]
  ```
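For readers without a Meerschaum instance at hand, the shape of the output above can be approximated in plain Python. This is a standalone sketch, not Meerschaum's implementation: the helpers `chunk_bounds()` and `batch_bounds()` are hypothetical names, and the rule that each batch holds at most `workers` chunks is an assumption inferred from the example output (12 chunks, `workers=4`, 3 batches).

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

Bounds = Tuple[datetime, datetime]


def chunk_bounds(begin: datetime, end: datetime, chunk_interval: timedelta) -> List[Bounds]:
    """Split [begin, end) into consecutive (start, stop) pairs (hypothetical helper)."""
    bounds = []
    start = begin
    while start < end:
        # Clamp the final chunk so it never extends past `end`.
        stop = min(start + chunk_interval, end)
        bounds.append((start, stop))
        start = stop
    return bounds


def batch_bounds(bounds: List[Bounds], workers: int) -> List[Tuple[Bounds, ...]]:
    """Group chunks into sequential batches of at most `workers` chunks (assumed rule)."""
    return [
        tuple(bounds[i:i + workers])
        for i in range(0, len(bounds), workers)
    ]


begin = datetime(2025, 1, 10, tzinfo=timezone.utc)
end = datetime(2025, 1, 15, tzinfo=timezone.utc)
bounds = chunk_bounds(begin, end, timedelta(hours=10))
batches = batch_bounds(bounds, workers=4)
print(len(bounds), len(batches))
# 12 3
```

Because the batches are processed one after another, an interrupted run only needs to redo the batch that was in progress, which is what makes verification resumable.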
- Add `--skip-chunks-with-greater-rowcounts` to `verify pipes`.
  The flag `--skip-chunks-with-greater-rowcounts` compares a chunk's rowcount with the rowcount of the remote table and skips the chunk if its rowcount is greater than or equal to the remote count. This is only applicable for connectors which implement `remote=True` support for `get_sync_time()`.
- Add `verify rowcounts`.
  The action `verify rowcounts` (same as passing `--check-rowcounts-only` to `verify pipes`) compares rowcounts for a pipe's chunks against remote rowcounts. This is only applicable for connectors which implement `get_pipe_rowcount()` with support for `remote=True`.
- Add `remote` to `Pipe.get_sync_time()`.
  For pipes which support it (i.e. the `SQLConnector`), the option `remote` is intended to return the sync time of a pipe's fetch definition, like the option `remote` in `Pipe.get_rowcount()`.
- Allow for the Web API to serve pipes from multiple instances.
  You can disable this behavior by setting `system:api:permissions:instances:allow_multiple_instances` to `false`. You may also explicitly list which instances may be accessed by the Web API by setting `system:api:permissions:instances:allowed_instance_keys` (defaults to `["*"]`).
- Fix memory leak for retrying failed chunks.
  Failed chunks were kept in memory and retried later. In resource-intensive syncs with large chunks and many failures, this prevented large objects from being freed and consumed excessive memory. This has been fixed.
- Add negation to job actions.
  Prefix a job name with an underscore to select all other jobs. This is useful for filtering out noise for `show logs`.
- Add `Pipe.parent`.
  As a quality-of-life improvement, the attribute `Pipe.parent` will return the first member of `Pipe.parents` (if available).
- Use the current instance for new tabs in the Webterm.
  Clicking "New Tab" will open a new `tmux` window using the currently selected instance on the Web Console.
- Other webterm quality-of-life improvements.
  Added a size toggle button to allow for the webterm to take the entire page.
- Additional refactoring work.
  The API endpoints code has been cleaned up.
- Added system configurations.
  New options have been added to the `system` configuration, such as `max_response_row_limit`, `allow_multiple_instances`, and `allowed_instance_keys`.
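
The colon-delimited keys in these notes (for example `system:api:permissions:instances:allowed_instance_keys`) read naturally as paths into a nested configuration. The sketch below illustrates that reading in plain Python. It does not call Meerschaum: `set_nested()` and `is_instance_allowed()` are hypothetical helpers, and treating `"*"` as "allow every instance" is an assumption based on the stated default of `["*"]`.

```python
from typing import Any, Dict


def set_nested(config: Dict[str, Any], key_path: str, value: Any) -> Dict[str, Any]:
    """Set `value` at a colon-delimited path, creating intermediate dicts (hypothetical helper)."""
    keys = key_path.split(':')
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config


def is_instance_allowed(config: Dict[str, Any], instance_keys: str) -> bool:
    """Check an instance against the allow-list; '*' is assumed to permit everything."""
    instances_cfg = (
        config.get('system', {})
        .get('api', {})
        .get('permissions', {})
        .get('instances', {})
    )
    # Fall back to the default stated in the release notes.
    allowed = instances_cfg.get('allowed_instance_keys', ['*'])
    return '*' in allowed or instance_keys in allowed


config: Dict[str, Any] = {}
set_nested(config, 'system:api:permissions:instances:allow_multiple_instances', False)
set_nested(config, 'system:api:permissions:instances:allowed_instance_keys', ['sql:local'])

print(is_instance_allowed(config, 'sql:local'))
# True
```

With no allow-list set, the same check falls back to `["*"]` and admits any instance, mirroring the documented default.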