
fix: pass scale to DF round in spark_round #1341

Merged: 8 commits merged into apache:main on Feb 4, 2025

Conversation

cht42 (Contributor) commented on Jan 26, 2025

Which issue does this PR close?

Closes #1340.

Rationale for this change

What changes are included in this PR?

How are these changes tested?

Review diff context (removed lines, then the added line, truncated in the review view):

    DataType::Float32 | DataType::Float64 => {
        Ok(ColumnarValue::Array(round(&[Arc::clone(array)])?))
    }

    DataType::Float32 | DataType::Float64 => Ok(ColumnarValue::Array(round(&[
Contributor:

FYI, round for floats is disabled: https://github.com/apache/datafusion-comet/blob/main/spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala#L1739

BTW, shouldn't we be able to use point instead of args[1]?

Contributor Author (cht42):

point is set to the scalar value on this line, and we want to pass the columnar value to DF:

    let ColumnarValue::Scalar(ScalarValue::Int64(Some(point))) = point else 
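
To make the fix concrete, here is a minimal sketch (not the exact PR diff) of passing the scale down to DataFusion's round kernel. It assumes round takes &[ArrayRef] and returns Result<ArrayRef>, as in the diff above, and it materializes the scale from the already-extracted point as an Int64 column for simplicity; the helper name round_float_with_scale is illustrative, and import paths may differ between DataFusion versions.

    use std::sync::Arc;

    use arrow::array::{ArrayRef, Int64Array};
    use datafusion::error::Result;
    use datafusion::logical_expr::ColumnarValue;

    // Sketch only: `round` stands in for DataFusion's kernel that takes
    // `&[ArrayRef]` and returns `Result<ArrayRef>`, as in the diff above.
    fn round_float_with_scale(
        array: &ArrayRef,
        point: i64,
        round: impl Fn(&[ArrayRef]) -> Result<ArrayRef>,
    ) -> Result<ColumnarValue> {
        // Before the fix the scale was dropped, so the kernel always rounded
        // to zero decimal places:
        //     Ok(ColumnarValue::Array(round(&[Arc::clone(array)])?))

        // After the fix the scale is passed as a second column, so the kernel
        // rounds to `point` decimal places.
        let scale: ArrayRef = Arc::new(Int64Array::from(vec![point; array.len()]));
        Ok(ColumnarValue::Array(round(&[Arc::clone(array), scale])?))
    }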

Contributor Author (cht42):

In my use case, I'm not using Comet's query planning, so it should be fine.

Member:

Since this isn't being tested end-to-end, could you add a Rust unit test so that we protect against regressions in the future?

Contributor:

Still, for the reason mentioned in the link, it may not work in some cases.

Contributor Author (cht42):

Added tests for floats.

codecov-commenter commented on Jan 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 39.16%. Comparing base (f09f8af) to head (ab5379b).
Report is 17 commits behind head on main.

Additional details and impacted files
@@              Coverage Diff              @@
##               main    #1341       +/-   ##
=============================================
- Coverage     56.12%   39.16%   -16.96%     
- Complexity      976     2069     +1093     
=============================================
  Files           119      262      +143     
  Lines         11743    60323    +48580     
  Branches       2251    12836    +10585     
=============================================
+ Hits           6591    23627    +17036     
- Misses         4012    32223    +28211     
- Partials       1140     4473     +3333     


kazuyukitanimura (Contributor) left a comment:

I am OK to accept this PR since round is disabled for floating points. The current round behavior is not Spark-compatible, though, and we will change the logic to match the Spark behavior in the future.


Review diff context from the new test (truncated in the review view):

    #[test]
    fn test_round_f32() -> Result<()> {
        let args = vec![
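
For reference, a minimal sketch of how the truncated test above might look in full, assuming spark_round is in scope with the signature spark_round(&[ColumnarValue], &DataType) -> Result<ColumnarValue>; the input values, the test name, and the approximate comparisons are illustrative, not copied from the PR.

    use std::sync::Arc;

    use arrow::array::{ArrayRef, Float32Array};
    use arrow::datatypes::DataType;
    use datafusion::common::ScalarValue;
    use datafusion::error::Result;
    use datafusion::logical_expr::ColumnarValue;

    #[test]
    fn test_round_f32_with_scale() -> Result<()> {
        // One float column plus a scalar scale of 2 decimal places.
        let args = vec![
            ColumnarValue::Array(
                Arc::new(Float32Array::from(vec![1.2345f32, -1.2345f32])) as ArrayRef,
            ),
            ColumnarValue::Scalar(ScalarValue::Int64(Some(2))),
        ];

        // Assumed signature of the function under test.
        let ColumnarValue::Array(result) = spark_round(&args, &DataType::Float32)? else {
            unreachable!("array input should produce an array result")
        };
        let result = result
            .as_any()
            .downcast_ref::<Float32Array>()
            .expect("Float32 output");

        // With the scale passed through, 1.2345 rounds to 1.23 rather than 1.0.
        assert!((result.value(0) - 1.23).abs() < 1e-4);
        assert!((result.value(1) + 1.23).abs() < 1e-4);
        Ok(())
    }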
Contributor:

Ideally we should have scalar test cases as well as random-value tests, and ideally we need expected values from Spark results.

Contributor Author (cht42):

Can we get it merged, or do you want more tests? Also, how would we use Spark results in tests?

Contributor:

Thanks @cht42, merged.
For the Spark results, you would need to get them manually for testing in Rust.
We can enable tests like https://github.com/apache/datafusion-comet/blob/main/spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala#L1239 once we fully support round for floating points.

kazuyukitanimura merged commit 3d9cbd5 into apache:main on Feb 4, 2025
74 checks passed
Development
Successfully merging this pull request may close these issues: Scale argument is not passed to DF round in spark_round (#1340).
4 participants