-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CALCITE-6465: Rework code generator to use Flink code splitter #3901
base: main
Are you sure you want to change the base?
Changes from 17 commits
6ac2d2e
c63821e
5556360
d3cf335
739c747
e9e20e1
dc783a9
b7553e4
f63b665
4216e80
2077d1e
fd2de3d
8f04243
dd5b995
8f62b88
1766b35
cd33092
dc1c66d
2ce8cac
6d545db
6200d82
39e14a4
d969ee4
232eb8e
2887c0d
fdd0d18
80c8587
4a258d8
c28f962
4f95aa0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -111,7 +111,10 @@ public static Bindable toBindable(Map<String, Object> parameters, | |
parameters); | ||
|
||
final ClassDeclaration expr = relImplementor.implementRoot(rel, prefer); | ||
String s = Expressions.toString(expr.memberDeclarations, "\n", false); | ||
String s = | ||
Expressions.toString(expr.memberDeclarations, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. put There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
"\n", false, | ||
CalciteSystemProperty.ENABLE_METHOD_SPLITTING.value()); | ||
|
||
if (CalciteSystemProperty.DEBUG.value()) { | ||
Util.debugCode(System.out, s); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -420,6 +420,13 @@ public final class CalciteSystemProperty<T> { | |
public static final CalciteSystemProperty<Integer> FUNCTION_LEVEL_CACHE_MAX_SIZE = | ||
intProperty("calcite.function.cache.maxSize", 0, v -> v >= 0); | ||
|
||
/** Whether to enable automatic method splitting when generating Java code. | ||
* | ||
* <p>Some queries can generate methods exceeding the JVM limit of 4000 characters per method. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you include a citation for your statement "JVM limit of 4000 characters per method"? Java 7 had a limit 65534 bytes (bytecode size). When did that change? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I got this 4K from Flink's documentation on the configuration option relating to this: So it's not exactly that the JVM has a limit of 4K, it's that the default JIT allows 8K only. Let me find a source for this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's specified here in OpenJDK (other vendors use the same limit): https://github.com/openjdk/jdk11u-dev/blob/a754a3d8972abc67aee768d31a2dc2e8214274cf/src/hotspot/share/runtime/globals.hpp#L2391 It's considered a developer option though, rather than an end-user option, so documentation is limited. |
||
* Enable this feature to automatically detect and split methods larger than the limit. */ | ||
public static final CalciteSystemProperty<Boolean> ENABLE_METHOD_SPLITTING = | ||
booleanProperty("calcite.linq.enable_method_splitting", false); | ||
|
||
private static CalciteSystemProperty<Boolean> booleanProperty(String key, | ||
boolean defaultValue) { | ||
// Note that "" -> true (convenient for command-lines flags like '-Dflag') | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -186,7 +186,9 @@ static Scalar.Producer baz(ParameterExpression context_, | |
final ClassDeclaration classDeclaration = | ||
Expressions.classDecl(Modifier.PUBLIC, "Buzz", null, | ||
ImmutableList.of(Scalar.Producer.class), declarations); | ||
String s = Expressions.toString(declarations, "\n", false); | ||
String s = | ||
Expressions.toString(declarations, "\n", false, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The declarations here usually do not look like ClassDeclarations either. |
||
CalciteSystemProperty.ENABLE_METHOD_SPLITTING.value()); | ||
if (CalciteSystemProperty.DEBUG.value()) { | ||
Util.debugCode(System.out, s); | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -51,7 +51,7 @@ public Type getType() { | |
} | ||
|
||
@Override public String toString() { | ||
ExpressionWriter writer = new ExpressionWriter(true); | ||
ExpressionWriter writer = new ExpressionWriter(true, true); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ideally this would read from the CalciteSystemProperty but calcite-core depends on linq4j so it can't be looked up here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will this have any practical impact on the actual code generation? (like splitting in some circumstances where the CalciteSystemProperty has not been set to true). If not, I'd say this is acceptable. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I do not know this code, but wonder if the split configuration values be passed by value into the linq4j layer. If possible , this would avoid the dependancy recursion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think this toString() method is used in code generation, just logging/debugging. |
||
accept(writer, 0, 0); | ||
return writer.toString(); | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,15 +32,24 @@ class ExpressionWriter { | |
|
||
private final Spacer spacer = new Spacer(0); | ||
private final StringBuilder buf = new StringBuilder(); | ||
private boolean indentPending; | ||
private final boolean generics; | ||
private final boolean methodSplitting; | ||
private boolean indentPending; | ||
|
||
ExpressionWriter() { | ||
this(true); | ||
this(true, true); | ||
} | ||
|
||
ExpressionWriter(boolean generics) { | ||
ExpressionWriter(boolean generics, boolean methodSplitting) { | ||
this.generics = generics; | ||
this.methodSplitting = methodSplitting; | ||
} | ||
|
||
public ExpressionWriter duplicateState() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. public method needs javadoc There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I made this package-private rather than adding javadoc -- it really seems like most of the methods in ExpressionWriter can be package-private. |
||
final ExpressionWriter writer = new ExpressionWriter(this.generics, this.methodSplitting); | ||
writer.indentPending = this.indentPending; | ||
writer.spacer.add(this.spacer.get()); | ||
return writer; | ||
} | ||
|
||
public void write(Node expression) { | ||
|
@@ -73,6 +82,10 @@ public boolean requireParentheses(Expression expression, int lprec, | |
return true; | ||
} | ||
|
||
public boolean usesMethodSplitting() { | ||
return methodSplitting; | ||
} | ||
|
||
/** | ||
* Increases the indentation level. | ||
*/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,9 @@ | |
*/ | ||
package org.apache.calcite.linq4j.tree; | ||
|
||
import org.apache.commons.lang3.StringUtils; | ||
import org.apache.flink.table.codesplit.JavaCodeSplitter; | ||
|
||
import org.checkerframework.checker.nullness.qual.Nullable; | ||
|
||
import java.lang.reflect.Modifier; | ||
|
@@ -57,20 +60,39 @@ public MethodDeclaration(int modifier, String name, Type resultType, | |
} | ||
|
||
@Override public void accept(ExpressionWriter writer) { | ||
String modifiers = Modifier.toString(modifier); | ||
writer.append(modifiers); | ||
final ExpressionWriter writerForUnsplitMethod = writer.usesMethodSplitting() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have the impression the intent of this line is not very obvious, perhaps a comment would help to clarify it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done in dc1c66d There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @jduo , just a small comment, because I find the comment a bit confusing (maybe I'm getting it wrong), where it says:
shouldn't it say "disabled" ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes that's correct. I've updated this comment, but note that it is now in ClassDeclaration instead of MethodDeclaration. The method splitter requires an enclosing class to write some of its output too (it can extract variable names to instance variables and move blocks of code to generated private methods to reduce the size of a method). |
||
? writer.duplicateState() : writer; | ||
|
||
final String modifiers = Modifier.toString(modifier); | ||
writerForUnsplitMethod.append(modifiers); | ||
if (!modifiers.isEmpty()) { | ||
writer.append(' '); | ||
writerForUnsplitMethod.append(' '); | ||
} | ||
//noinspection unchecked | ||
writer | ||
writerForUnsplitMethod | ||
.append(resultType) | ||
.append(' ') | ||
.append(name) | ||
.list("(", ", ", ")", | ||
() -> (Iterator) parameters.stream().map(ParameterExpression::declString).iterator()) | ||
.append(' ') | ||
.append(body); | ||
|
||
if (writer.usesMethodSplitting()) { | ||
// Specifies a threshold where generated code will be split into sub-function calls. | ||
// Java has a maximum method length of 64 KB. This setting allows for finer granularity if | ||
// necessary. | ||
// Default value is 4000 instead of 64KB as by default JIT refuses to work on methods with | ||
// more than 8K byte code. | ||
final int defaultMaxGeneratedCodeLength = 4000; | ||
final int defaultMaxMembersGeneratedCode = 10000; | ||
|
||
writer.append( | ||
StringUtils.stripStart( | ||
JavaCodeSplitter.split(writerForUnsplitMethod.toString(), | ||
defaultMaxGeneratedCodeLength, defaultMaxMembersGeneratedCode), | ||
" ")); | ||
} | ||
writer.newlineAndIndent(); | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a note above this list of declarations that we import flink-table-code-splitter but we must not import core flink modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.