Ignore columns in sync_table_2 #88

Open
pbchase opened this issue Nov 28, 2022 · 0 comments

Ignore specified columns in the update_records data frame of sync_table_2, at this TODO:

# TODO: detect "created/updated" and adjust values where appropriate

The problem stems from adding columns to a data frame to time-stamp record creation and update. The added columns are by definition not in the data source, and their values are completely novel on each run of the script. If on run N we get data A and on run N+1 we get data A + data B, we will rewrite all of A with a new timestamp, even though nothing in A has changed.

I think the fix is to let the caller specify a vector of column names to ignore in the anti_join that removes already-known data. The new parameter might be named columns_to_ignore, ignore_in_update, novel_columns, or something better than these examples I am throwing up against the wall.
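
A minimal sketch of the idea, assuming dplyr is available. The helper find_changed_records() and the sample tables are hypothetical and are not the actual sync_table_2 code; columns_to_ignore is one of the candidate parameter names above. It compares rows on every shared column except the ignored ones, so a fresh timestamp alone does not mark a row as changed.

```r
library(dplyr)

# Hypothetical helper for illustration only; not part of sync_table_2.
# Returns the rows of `source` that have no match in `target` when
# compared on all shared columns except those in `columns_to_ignore`.
find_changed_records <- function(source, target, columns_to_ignore = character(0)) {
  compare_cols <- setdiff(
    intersect(names(source), names(target)),
    columns_to_ignore
  )
  dplyr::anti_join(source, target, by = compare_cols)
}

# Run N wrote data A (ids 1 and 2) with a time stamp.
target <- tibble(
  id = 1:2,
  value = c("a", "b"),
  updated = as.POSIXct("2022-11-27 00:00:00", tz = "UTC")
)

# Run N+1 reads data A + data B (id 3) and stamps every row anew.
source <- tibble(
  id = 1:3,
  value = c("a", "b", "c"),
  updated = as.POSIXct("2022-11-28 00:00:00", tz = "UTC")
)

# Comparing on all columns flags every row, because `updated` always differs.
nrow(anti_join(source, target, by = names(source)))
# 3

# Ignoring the time-stamp column leaves only the genuinely new row (id 3).
find_changed_records(source, target, columns_to_ignore = "updated")
```

With the default columns_to_ignore = character(0), the helper compares on all shared columns, so existing callers would see no change in behavior.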
