Preparing v4.0.1 #684

richox · 2024-12-06T11:57:12Z

Planning to release v4.0.1:

New Feature

Initial support for the ORC input file format.
Initial support for the RSS framework and the Apache Celeborn shuffle service.

Improvement

Optimize AggExec by implementing columnar-based aggregation.
Use a custom hashmap implementation for aggregation.
Support specialized count(0).
Optimize bloom filter by reusing the same bloom filter within an executor.
Optimize bloom filter by supporting shrinking.
Optimize reading parquet files by supporting parallel reading.
Improve spill file deletion logic.

Bug fixes

Fix file not found for paths with URL-encoded characters.
Fix HashAggregate conversion throwing ScalaReflectionException.
Fix pruning error while reading parquet files with multiple row groups.
Fix incorrect number of tasks due to missing shuffleOrigin.
Fix record batch creation error when hash joining with empty input.

Other

Upgrade datafusion/arrow dependencies to v42/v53.
Replace gxhash with foldhash for better compatibility on some hardware.
Other minor improvements and fixes.

PRs

AggExec: implement columnar accumulator states. by @richox in #646
Bump bigdecimal from 0.4.5 to 0.4.6 by @dependabot in #638
Bump bytes from 1.7.2 to 1.8.0 by @dependabot in #625
Bump bytes from 1.8.0 to 1.9.0 by @dependabot in #671
Bump object_store from 0.11.0 to 0.11.1 by @dependabot in #622
Bump sonic-rs from 0.3.13 to 0.3.14 by @dependabot in #623
Bump sonic-rs from 0.3.14 to 0.3.16 by @dependabot in #647
Bump tempfile from 3.13.0 to 3.14.0 by @dependabot in #641
Bump tokio from 1.40.0 to 1.41.0 by @dependabot in #629
Bump tokio from 1.41.0 to 1.41.1 by @dependabot in #642
Bump tokio from 1.41.0 to 1.41.1 by @dependabot in #676
Bump uuid from 1.10.0 to 1.11.0 by @dependabot in #618
Create RecordBatch with num_rows option to avoid bhj error caused by empty output_schema by @wForget in #683
Fix build on windows by @wForget in #666
Fix file not found for path with url encoded character by @wForget in #679
Followup to #674, add -r for rm by @wForget in #681
Introduce base blaze sql test suite by @wForget in #674
[BLAZE-287][FOLLOWUP] Use JavaUtils#newConcurrentHashMap to speed up ConcurrentHashMap#computeIfAbsent by @SteNicholas in #615
[BLAZE-573][FOLLOWUP] Bump Spark from 3.4.3 to 3.4.4 by @SteNicholas in #640
[BLAZE-627] Make ORC and Parquet format detection more generic by @dixingxing0 in #628
[BLAZE-664] Bump Celeborn version from 0.5.1 to 0.5.2 by @SteNicholas in #665
[MINOR] Avoid NPE when native lib is not found by @wForget in #668
add new blaze logo by @richox in #633
chore: Make spotless plugin happy by @zuston in #653
code refactoring by @richox in #658
code refactoring by @richox in #677
doc: update tpc-h benchmark result by @richox in #614
fix Hashaggregate convert job throw ScalaReflectionException by @leizhang5s in #637
fix pruning error while reading parquet files with multiple row groups by @richox in #616
fix running error for Spark 3.2.0 and 3.2.1 by @XorSum in #602
fix(shuffle): Progagate shuffle origin to native exchange exec to make AQE rebalance valid by @zuston in #663
fix(spill): Delete spill file when dropping for rust FileSpill by @zuston in #660
fix(spill): Explicitly delete spill file for FileBasedSpillBuf after release by @zuston in #654
improve NativeOrcScan by @richox in #631
improve memory management by @richox in #621
improvement: Add numOfPartitions metrics for exchange exec to align with vanilla spark by @zuston in #669
optimize bloom filter by @richox in #620
parquet reading improvements by @richox in #650
release version v4.0.0 by @richox in #613
replace gxhash with foldhash by @richox in #624
supports specialized count(0) by @richox in #619
tpcd benchmarkrunner : add orc format support by @leizhang5s in #639
update to datafusion-v42 by @richox in #574
use custom implemented hashmap for aggregation by @richox in #617

richox mentioned this issue Dec 10, 2024

release version v4.0.1 #690

Merged

lihao712 closed this as completed in #690 Dec 10, 2024
