Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6020] SqlToRelConverter should not replace windowed SUM with… #3882

Merged
merged 1 commit into from
Aug 21, 2024

Conversation

normanj-bitquill
Copy link
Contributor

… equivalent expression using SUM0

  • Add a new configuration option to RelBuilder.Config
    • Whether to change SUM to SUM0 in a window
  • Call over to RelBuilder to determine the operator to use when in a window but not a histogram

@mihaibudiu
Copy link
Contributor

This is supposedly fixing a bug. No test for that bug?

@@ -1475,6 +1475,16 @@ protected AggCall aggregateCall(SqlAggFunction aggFunction, boolean distinct,
filter, alias, preOperands, operands, distinctKeys, orderKeys);
}

public SqlAggFunction windowedAggregateOperator(SqlAggFunction aggFunction,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be a SqlToRelConverter config option?

@@ -6154,7 +6150,7 @@ private class HistogramShuttle extends RexShuttle {
: c.rangeBetween(lowerBound, upperBound))
.exclude(exclude)
.allowPartial(allowPartial)
.nullWhenCountZero(needSum0)
.nullWhenCountZero(aggOp == SqlStdOperatorTable.SUM && type.isNullable())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't this will still add the count stuff if the actual relBuilder.windowedAggregateOperator decided to not override it?

@normanj-bitquill
Copy link
Contributor Author

I've reworked this PR. Now the conversion of SUM to SUM0 happens in a new RelRule.

The advantage is that a user can choose whether to include the new RelRule or not.
The downside is that a user could lose the SUM to SUM0 conversion on upgrade depending on how they build up their list of RelRules to use.

@@ -41,16 +43,17 @@ public class RexShuttle implements RexVisitor<RexNode> {
@Override public RexNode visitOver(RexOver over) {
boolean[] update = {false};
List<RexNode> clonedOperands = visitList(over.operands, update);
SqlAggFunction op = visitOverAggFunction(over.getAggOperator());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more suggestive name for the variable would be overAggregator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@@ -5790,7 +5790,7 @@ window w1 as (partition by job order by hiredate rows 2 preceding)]]>
</Resource>
<Resource name="plan">
<![CDATA[
LogicalProject(EXPR$0=[CASE(>(COUNT($5) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), 0), $SUM0($5) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), null:INTEGER)], EXPR$1=[/(CASE(>(COUNT(CAST($5):REAL NOT NULL) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), 0), $SUM0(CAST($5):REAL NOT NULL) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), null:REAL), COUNT(CAST($5):REAL NOT NULL) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING))])
LogicalProject(EXPR$0=[CASE(>(COUNT($5) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), 0), SUM($5) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), null:INTEGER)], EXPR$1=[/(CASE(>(COUNT(CAST($5):REAL NOT NULL) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), 0), SUM(CAST($5):REAL NOT NULL) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING), null:REAL), COUNT(CAST($5):REAL NOT NULL) OVER (PARTITION BY $2 ORDER BY $4 ROWS 2 PRECEDING))])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these look like regressions to me. Do you expect this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe this is actually addressing the actual issue: you are saying that theses were incorrect to start with?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is sort of a regression. These tests only convert the SQL to Rel, but do not go through planning the query. As a result, they do not run the RelRules, so SUM is not converted to SUM0. There are other tests (in RelOptRulesTest) that also go through planning and verify that SUM becomes SUM0.


/**
* Planner rule that converts SUM to SUM0 when it is used in an OVER clause inside
* the project list.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add an argument why this is always safe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expanded on the Javadoc here to explain why this rule should be used. Let me know if more explanation is needed.

@mihaibudiu mihaibudiu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Aug 20, 2024
A new RelRule was created that looks for OVER in a the project list. If the OVER contains SUM in the
aggregate, it is changed to SUM0. The conversion now happens during planning rather than when SQL is
converted to Rel.
Copy link

@normanj-bitquill
Copy link
Contributor Author

The commits have been squashed.

@mihaibudiu mihaibudiu merged commit 7a402e8 into apache:main Aug 21, 2024
19 checks passed
@normanj-bitquill normanj-bitquill deleted the calcite-6020 branch August 27, 2024 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants