[HUDI-8892] Introduce projection push down for payload mode #12684
+59
−2
In many wide-table scenarios, a table may have thousands of columns, with multiple tasks processing different columns. When reading, downstream consumers usually care about only a few dimension columns. However, payload mode currently does not support column pruning, which causes a significant performance regression for snapshot reads on file slices that contain log files: all columns of the base file are read, even though most of them are not needed by the user.
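To illustrate the idea, here is a minimal sketch of column pruning for a merge-based read. It is not the code in this PR and does not use Hudi's API: the class `ProjectionSketch`, the method `requiredColumns`, and the column names are all hypothetical. It assumes the merge step needs a record key and an ordering field in addition to the columns the query asked for.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Hudi's API.
public class ProjectionSketch {

    /**
     * Returns the columns to read from the base file: the columns the query
     * asked for plus the fields the merge step needs (assumed here to be a
     * record key and an ordering field), kept in table schema order.
     */
    static List<String> requiredColumns(List<String> tableSchema,
                                        List<String> queryColumns,
                                        List<String> mergeFields) {
        Set<String> wanted = new LinkedHashSet<>(queryColumns);
        wanted.addAll(mergeFields);
        List<String> result = new ArrayList<>();
        for (String column : tableSchema) {
            if (wanted.contains(column)) {
                result.add(column);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A tiny stand-in for a wide table; real tables may have thousands of columns.
        List<String> schema = List.of("id", "ts", "dim_a", "metric_1", "metric_2");
        // The query selects only dim_a; "id" and "ts" are assumed merge fields.
        System.out.println(requiredColumns(schema, List.of("dim_a"), List.of("id", "ts")));
        // prints [id, ts, dim_a]
    }
}
```

With a projection like this, the base file reader only needs to fetch three columns instead of the full schema, while the merge with log records still has the fields it depends on.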
Change Logs
Describe context and summary for this change. Highlight if any code was copied.
Impact
Improves MOR (merge-on-read) read performance.
Risk level (write none, low medium or high below)
medium
Documentation Update
none
Contributor's checklist