Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CALCITE-6465: Rework code generator to use Flink code splitter #3901

Open
wants to merge 30 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
6ac2d2e
WIP: Integrate Flink code splitter
jduo Aug 2, 2024
c63821e
Remove exclusion of flink-core
jduo Aug 2, 2024
5556360
Fix format violations
jduo Aug 21, 2024
d3cf335
Refactor so that only methods get code-split
jduo Aug 21, 2024
739c747
Remove extra newline
jduo Aug 21, 2024
e9e20e1
Update calls to Expressions.toString()
jduo Sep 10, 2024
dc783a9
Merge fixes
jduo Sep 18, 2024
b7553e4
Fix whitespace differences in generated code
jduo Sep 19, 2024
f63b665
Checker framework
jduo Sep 19, 2024
4216e80
Autostyle
jduo Sep 19, 2024
2077d1e
Parameterize use of method splitting
jduo Sep 19, 2024
fd2de3d
Enable method splitting in linq4j tests
jduo Sep 20, 2024
8f04243
Make method-splitting in library code based on a CalciteSystemProperty
jduo Sep 20, 2024
dd5b995
Fix commented-out code
jduo Sep 20, 2024
8f62b88
Revert unintended change around assert
jduo Sep 20, 2024
1766b35
Revert unintended change
jduo Sep 20, 2024
cd33092
Fix merge issues
jduo Sep 20, 2024
dc1c66d
PR feedback
jduo Sep 24, 2024
2ce8cac
Turn off method-splitting for EnumUtils tests
jduo Sep 25, 2024
6d545db
Make method splitting happen on class declarations
jduo Sep 27, 2024
6200d82
Simplify the test case
jduo Sep 27, 2024
39e14a4
Fix typo in comment
jduo Sep 28, 2024
d969ee4
Make configuration of method splitting a threshold instead of a toggle
jduo Sep 28, 2024
232eb8e
Revert accidental change
jduo Sep 28, 2024
2887c0d
Fix some bugs introduced during refactoring
jduo Sep 28, 2024
fdd0d18
Fix Javadoc linter errors
jduo Sep 30, 2024
80c8587
PR feedback
jduo Oct 3, 2024
4a258d8
Clarify testLargeMethod
jduo Oct 4, 2024
c28f962
Add comments about circular dependency concerns
jduo Oct 4, 2024
4f95aa0
Expand on method-splitting tests
jduo Oct 4, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions bom/build.gradle.kts
Original file line number Diff line number Diff line change
Expand Up @@ -111,6 +111,7 @@ dependencies {
apiv("org.apache.commons:commons-pool2")
apiv("org.apache.commons:commons-collections4")
apiv("org.apache.commons:commons-text")
apiv("org.apache.flink:flink-table-code-splitter")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a note above this list of declarations that we import flink-table-code-splitter but we must not import core flink modules.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

apiv("org.apache.geode:geode-core")
apiv("org.apache.hadoop:hadoop-client", "hadoop")
apiv("org.apache.hadoop:hadoop-common", "hadoop")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -111,7 +111,9 @@ public static Bindable toBindable(Map<String, Object> parameters,
parameters);

final ClassDeclaration expr = relImplementor.implementRoot(rel, prefer);
String s = Expressions.toString(expr.memberDeclarations, "\n", false);
String s =
Expressions.toString(expr.memberDeclarations, "\n", false,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that code split is not effective here. This is because the code splitting logic in this pr is implemented in ClassDeclaration, and the expr.memberDeclarations here is usually MethodDeclaration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asdfgh19 , thanks for your comments. Would you be able to add tests for your scenarios to the PR?

For this one, it seems like we a can pass expr to Expressions.toString(), but I'm not sure what effect that has on the calls that depend on s ( Hook.JAVA_PLAN.run(s);)

CalciteSystemProperty.MAX_METHOD_LENGTH_IN_CHARS_BEFORE_SPLITTING.value());

if (CalciteSystemProperty.DEBUG.value()) {
Util.debugCode(System.out, s);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@
*/
package org.apache.calcite.config;

import org.apache.calcite.linq4j.tree.Expressions;

import com.google.common.collect.ImmutableSet;

import org.checkerframework.checker.nullness.qual.Nullable;
Expand Down Expand Up @@ -420,6 +422,23 @@ public final class CalciteSystemProperty<T> {
public static final CalciteSystemProperty<Integer> FUNCTION_LEVEL_CACHE_MAX_SIZE =
intProperty("calcite.function.cache.maxSize", 0, v -> v >= 0);

/**
* The length of code in characters (inclusive) before automatic method splitting is enabled.
* If the length of a generated is equal or less than this value, automatic method splitting is
* disabled.
*
* <p>A value of {@link Expressions#DISABLE_METHOD_SPLITTING}, the default, disables automatic
* method splitting.
*
* <p>Some queries can generate methods exceeding the default JIT limit of 4000 characters per
* method. Use this setting to automatically detect and split methods larger than the specified
* limit.
*/
public static final CalciteSystemProperty<Integer> MAX_METHOD_LENGTH_IN_CHARS_BEFORE_SPLITTING =
intProperty("calcite.linq.method_splitting_character_limit",
Expressions.DISABLE_METHOD_SPLITTING,
v -> v == Expressions.DISABLE_METHOD_SPLITTING || v >= 0);

private static CalciteSystemProperty<Boolean> booleanProperty(String key,
boolean defaultValue) {
// Note that "" -> true (convenient for command-lines flags like '-Dflag')
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -186,7 +186,9 @@ static Scalar.Producer baz(ParameterExpression context_,
final ClassDeclaration classDeclaration =
Expressions.classDecl(Modifier.PUBLIC, "Buzz", null,
ImmutableList.of(Scalar.Producer.class), declarations);
String s = Expressions.toString(declarations, "\n", false);
String s =
Expressions.toString(declarations, "\n", false,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The declarations here usually do not look like ClassDeclarations either.

CalciteSystemProperty.MAX_METHOD_LENGTH_IN_CHARS_BEFORE_SPLITTING.value());
if (CalciteSystemProperty.DEBUG.value()) {
Util.debugCode(System.out, s);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,9 @@ private static String compile(RexBuilder rexBuilder, List<RexNode> constExps,
Expressions.methodDecl(Modifier.PUBLIC, Object[].class,
BuiltInMethod.FUNCTION1_APPLY.method.getName(),
ImmutableList.of(root0_), blockBuilder.toBlock());
String code = Expressions.toString(methodDecl);
String code =
Expressions.toString(methodDecl,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The methodDecl here is a MethodDeclaration, not a ClassDeclaration, so it seems that the code splitting logic implemented in the ClassDeclaration does not work here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this code relates to RexExecutable. The method body is generated here and written to String, then gets attached as the body of a class in RexExecutable that's run directly in Janino's IClassBodyEvaluator. This will be quite complicated to rework.

CalciteSystemProperty.MAX_METHOD_LENGTH_IN_CHARS_BEFORE_SPLITTING.value());
if (CalciteSystemProperty.DEBUG.value()) {
Util.debugCode(System.out, code);
}
Expand Down
1 change: 1 addition & 0 deletions gradle.properties
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,7 @@ dropwizard-metrics.version=4.0.5
# do not upgrade this, new versions are Category X license.
elasticsearch.version=7.10.2
embedded-redis.version=0.6
flink-table-code-splitter.version=1.20.0
jts-core.version=1.19.0
jts-io-common.version=1.19.0
proj4j.version=1.2.2
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -150,7 +150,9 @@ static List<String> innodbFieldNames(final RelDataType rowType) {
InnodbMethod.INNODB_QUERYABLE_QUERY.method, fields,
selectFields, cond, ascOrder));
if (CalciteSystemProperty.DEBUG.value()) {
System.out.println("Innodb: " + Expressions.toString(enumerable));
System.out.println(
"Innodb: " + Expressions.toString(enumerable,
CalciteSystemProperty.MAX_METHOD_LENGTH_IN_CHARS_BEFORE_SPLITTING.value()));
}
list.add(Expressions.return_(null, enumerable));
return implementor.result(physType, list.toBlock());
Expand Down
4 changes: 4 additions & 0 deletions linq4j/build.gradle.kts
Original file line number Diff line number Diff line change
Expand Up @@ -20,4 +20,8 @@ dependencies {

implementation("com.google.guava:guava")
implementation("org.apache.calcite.avatica:avatica-core")
// Note that only the flink-table-code-splitter module (and its dependencies) is imported
// from Flink. Other Flink modules have Calcite as a dependency and can introduce circular
// dependencies.
implementation("org.apache.flink:flink-table-code-splitter")
}
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,8 @@ public Type getType() {
}

@Override public String toString() {
ExpressionWriter writer = new ExpressionWriter(true);
// Use the JVM limit for method size (4K) as the method splitting threshold.
ExpressionWriter writer = new ExpressionWriter();
accept(writer, 0, 0);
return writer.toString();
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@
*/
package org.apache.calcite.linq4j.tree;

import org.apache.commons.lang3.StringUtils;
import org.apache.flink.table.codesplit.JavaCodeSplitter;

import org.checkerframework.checker.nullness.qual.Nullable;

import java.lang.reflect.Modifier;
Expand Down Expand Up @@ -46,19 +49,35 @@ public ClassDeclaration(int modifier, String name, @Nullable Type extended,
}

@Override public void accept(ExpressionWriter writer) {
// Conditionally serialize the class declaration directly to the supplied writer if method
// splitting is disabled or to a temporary writer that will generate code for the method that
// can be split before being serialized to the supplied writer.
final ExpressionWriter writerForClassWithoutSplitting = writer.usesMethodSplitting()
? writer.duplicateState() : writer;

String modifiers = Modifier.toString(modifier);
writer.append(modifiers);
writerForClassWithoutSplitting.append(modifiers);
if (!modifiers.isEmpty()) {
writer.append(' ');
writerForClassWithoutSplitting.append(' ');
}
writer.append(classClass).append(' ').append(name);
writerForClassWithoutSplitting.append(classClass).append(' ').append(name);
if (extended != null) {
writer.append(" extends ").append(extended);
writerForClassWithoutSplitting.append(" extends ").append(extended);
}
if (!implemented.isEmpty()) {
writer.list(" implements ", ", ", "", implemented);
writerForClassWithoutSplitting.list(" implements ", ", ", "", implemented);
}
writerForClassWithoutSplitting.list(" {\n", "", "}", memberDeclarations);

if (writer.usesMethodSplitting()) {
final int defaultMaxMembersGeneratedCode = 10000;

writer.append(
StringUtils.stripStart(
JavaCodeSplitter.split(writerForClassWithoutSplitting.toString(),
writer.getMaxMethodLengthInChars(), defaultMaxMembersGeneratedCode),
" "));
}
writer.list(" {\n", "", "}", memberDeclarations);
writer.newlineAndIndent();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,15 +32,31 @@ class ExpressionWriter {

private final Spacer spacer = new Spacer(0);
private final StringBuilder buf = new StringBuilder();
private boolean indentPending;
private final boolean generics;

/** The maximum number of characters (inclusive) in a method before method splitting is
* enabled. A value of {@link Expressions#DISABLE_METHOD_SPLITTING} indicates that there is no
* limit.
*/
private final int maxMethodLengthInChars;
private boolean indentPending;

ExpressionWriter() {
this(true);
this(true, Expressions.DISABLE_METHOD_SPLITTING);
}

ExpressionWriter(boolean generics) {
/**
* Creates an ExpressionWriter that can optionally emit generics and split methods.
*
* @param generics Indicates if generic type arguments should be written.
* @param maxMethodLengthInChars The maximum number of characters (inclusive) in a generated
* method before the method is split up into smaller methods.
* A value of {@link Expressions#DISABLE_METHOD_SPLITTING}
* indicates that there is no limit to method size.
*/
ExpressionWriter(boolean generics, int maxMethodLengthInChars) {
this.generics = generics;
this.maxMethodLengthInChars = maxMethodLengthInChars;
}

public void write(Node expression) {
Expand Down Expand Up @@ -73,6 +89,14 @@ public boolean requireParentheses(Expression expression, int lprec,
return true;
}

public boolean usesMethodSplitting() {
return maxMethodLengthInChars != Expressions.DISABLE_METHOD_SPLITTING;
}

public int getMaxMethodLengthInChars() {
return maxMethodLengthInChars;
}

/**
* Increases the indentation level.
*/
Expand Down Expand Up @@ -196,4 +220,12 @@ public void backUp() {
indentPending = false;
}
}

ExpressionWriter duplicateState() {
final ExpressionWriter writer =
new ExpressionWriter(this.generics, this.maxMethodLengthInChars);
writer.indentPending = this.indentPending;
writer.spacer.add(this.spacer.get());
return writer;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,11 @@
* Utility methods for expressions, including a lot of factory methods.
*/
public abstract class Expressions {
/**
* Used to disable method splitting when generating Java code with {@link #toString(Node, int)}.
*/
public static final int DISABLE_METHOD_SPLITTING = -1;

private Expressions() {}

/**
Expand All @@ -59,7 +64,19 @@ private Expressions() {}
*/
public static String toString(List<? extends Node> expressions, String sep,
boolean generics) {
final ExpressionWriter writer = new ExpressionWriter(generics);
return toString(expressions, sep, generics, Expressions.DISABLE_METHOD_SPLITTING);
}

/**
* Converts a list of expressions to Java source code, optionally emitting
* extra type information in generics. Optionally splits the generated method if
* the method exceeds the length specified in maxMethodLengthInChars.
* A maxMethodLengthInChars value of {@link #DISABLE_METHOD_SPLITTING}
* disables method splitting.
*/
public static String toString(List<? extends Node> expressions, String sep,
boolean generics, int maxMethodLengthInChars) {
final ExpressionWriter writer = new ExpressionWriter(generics, maxMethodLengthInChars);
for (Node expression : expressions) {
writer.write(expression);
writer.append(sep);
Expand All @@ -71,7 +88,17 @@ public static String toString(List<? extends Node> expressions, String sep,
* Converts an expression to Java source code.
*/
public static String toString(Node expression) {
return toString(Collections.singletonList(expression), "", true);
return toString(expression, Expressions.DISABLE_METHOD_SPLITTING);
}

/**
* Converts an expression to Java source code.
* Optionally splits the generated method if the method exceeds the length specified in
* maxMethodLengthInChars. A maxMethodLengthInChars value of {@link #DISABLE_METHOD_SPLITTING}
* disables method splitting.
*/
public static String toString(Node expression, int maxMethodLengthInChars) {
return toString(Collections.singletonList(expression), "", true, maxMethodLengthInChars);
}

/**
Expand Down
Loading