Preparing v4.0.1 #684

Closed
richox opened this issue Dec 6, 2024 · 0 comments · Fixed by #690


richox commented Dec 6, 2024

Planning to release v4.0.1:

New Feature

  • Initial support for the ORC input file format.
  • Initial support for the RSS framework and the Apache Celeborn shuffle service.

Improvement

  • Optimize AggExec by implementing columnar-based aggregation.
  • Use a custom hashmap implementation for aggregation.
  • Support specialized count(0).
  • Optimize bloom filter by reusing the same bloom filter within an executor.
  • Optimize bloom filter by supporting shrinking.
  • Optimize reading parquet files by supporting parallel reading.
  • Improve spill file deletion logic.

Bug fixes

  • Fix file-not-found error for paths with URL-encoded characters.
  • Fix HashAggregate conversion throwing ScalaReflectionException.
  • Fix pruning error while reading parquet files with multiple row groups.
  • Fix incorrect number of tasks due to missing shuffleOrigin.
  • Fix record batch creation error when hash joining with empty input.
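
For context on the URL-encoded path fix listed above, here is a minimal Python sketch of the general class of bug, where a path is percent-decoded once too often and then no longer points at the file on disk. This is only an illustration and does not reflect the actual root cause or the fix in the codebase; the helper names and the sample path are hypothetical.

```python
from urllib.parse import quote, unquote


def to_uri_path(fs_path: str) -> str:
    """Percent-encode a filesystem path for use in a URI (hypothetical helper)."""
    # Keep "/" as the separator; everything else unsafe, including "%", is escaped.
    return quote(fs_path, safe="/")


def from_uri_path(uri_path: str) -> str:
    """Decode a URI path back to the filesystem path, exactly once."""
    return unquote(uri_path)


# A directory name that literally contains the characters "%2F" on disk.
raw_path = "/warehouse/t/part=a%2Fb/file 1.parquet"

# Encoding and then decoding once round-trips to the original path.
encoded = to_uri_path(raw_path)
round_tripped = from_uri_path(encoded)

# Decoding a path that was never encoded turns "%2F" into "/",
# which names a different (and probably nonexistent) location.
over_decoded = unquote(raw_path)
```

The point of the sketch is that encoding and decoding have to be paired exactly once each; treating an already-literal path as encoded is one common way to end up with a file-not-found error.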

Other

  • Upgrade DataFusion/Arrow dependencies to v42/v53.
  • Replace gxhash with foldhash for better compatibility on some hardware.
  • Other minor improvements and fixes.

PRs
