-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WebAssembly] Don't return multivalue when Emscripten EH/SjLj is used #86048
Conversation
@llvm/pr-subscribers-backend-webassembly Author: Heejin Ahn (aheejin) ChangesEmscripten EH/SjLj uses invoke wrappers in JS, such as But in case the invoked function returns multiple values or a 128-bit value, its the JS invoke wrappers cannot return multivalue because JS doesn't support that. So we should not enable multivalue returns for the JS invoke wrappers and also the functions called by those JS wrappers because their signature has to match with the JS wrapper. For example, if
we should eventually lower both the invoke wrapper (whose name will be So we need to disable multivalue returns for JS invoke wrappers and
This PR is doing 3. While it is the most conservative and possibly This is separate from whether we should make the multivalue return Full diff: https://github.com/llvm/llvm-project/pull/86048.diff 7 Files Affected:
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
index 03897b551ecaee..c6c7bf1cacda40 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
@@ -151,19 +151,14 @@ static std::string getEmscriptenInvokeSymbolName(wasm::WasmSignature *Sig) {
//===----------------------------------------------------------------------===//
MCSymbolWasm *WebAssemblyAsmPrinter::getMCSymbolForFunction(
- const Function *F, bool EnableEmEH, wasm::WasmSignature *Sig,
+ const Function *F, bool EnableEmEHSjLj, wasm::WasmSignature *Sig,
bool &InvokeDetected) {
MCSymbolWasm *WasmSym = nullptr;
- if (EnableEmEH && isEmscriptenInvokeName(F->getName())) {
+ if (EnableEmEHSjLj && isEmscriptenInvokeName(F->getName())) {
assert(Sig);
InvokeDetected = true;
- if (Sig->Returns.size() > 1) {
- std::string Msg =
- "Emscripten EH/SjLj does not support multivalue returns: " +
- std::string(F->getName()) + ": " +
- WebAssembly::signatureToString(Sig);
- report_fatal_error(Twine(Msg));
- }
+ // When Emscripten EH/SjLj is enabled, we don't enable multivalue returns
+ assert(Sig->Returns.size() <= 1);
WasmSym = cast<MCSymbolWasm>(
GetExternalSymbolSymbol(getEmscriptenInvokeSymbolName(Sig)));
} else {
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h
index c30e0155c81e72..4892df1f22485d 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.h
@@ -81,7 +81,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssemblyAsmPrinter final : public AsmPrinter {
MVT getRegType(unsigned RegNo) const;
std::string regToString(const MachineOperand &MO);
WebAssemblyTargetStreamer *getTargetStreamer();
- MCSymbolWasm *getMCSymbolForFunction(const Function *F, bool EnableEmEH,
+ MCSymbolWasm *getMCSymbolForFunction(const Function *F, bool EnableEmEHSjLj,
wasm::WasmSignature *Sig,
bool &InvokeDetected);
MCSymbol *getOrCreateWasmSymbol(StringRef Name);
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index 905ff3b9018428..64bcadf3f5677c 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -1288,7 +1288,7 @@ bool WebAssemblyTargetLowering::CanLowerReturn(
const SmallVectorImpl<ISD::OutputArg> &Outs,
LLVMContext & /*Context*/) const {
// WebAssembly can only handle returning tuples with multivalue enabled
- return Subtarget->hasMultivalue() || Outs.size() <= 1;
+ return WebAssembly::canLowerReturn(Outs.size(), Subtarget);
}
SDValue WebAssemblyTargetLowering::LowerReturn(
@@ -1296,7 +1296,7 @@ SDValue WebAssemblyTargetLowering::LowerReturn(
const SmallVectorImpl<ISD::OutputArg> &Outs,
const SmallVectorImpl<SDValue> &OutVals, const SDLoc &DL,
SelectionDAG &DAG) const {
- assert((Subtarget->hasMultivalue() || Outs.size() <= 1) &&
+ assert(WebAssembly::canLowerReturn(Outs.size(), Subtarget) &&
"MVP WebAssembly can only return up to one value");
if (!callingConvSupported(CallConv))
fail(DL, DAG, "WebAssembly doesn't support non-C calling conventions");
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp
index 1e959111a4dbcb..2c104382aa65ca 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyMachineFunctionInfo.cpp
@@ -17,6 +17,7 @@
#include "Utils/WebAssemblyTypeUtilities.h"
#include "WebAssemblyISelLowering.h"
#include "WebAssemblySubtarget.h"
+#include "WebAssemblyUtilities.h"
#include "llvm/CodeGen/Analysis.h"
#include "llvm/CodeGen/WasmEHFuncInfo.h"
#include "llvm/Target/TargetMachine.h"
@@ -70,12 +71,12 @@ void llvm::computeSignatureVTs(const FunctionType *Ty,
computeLegalValueVTs(ContextFunc, TM, Ty->getReturnType(), Results);
MVT PtrVT = MVT::getIntegerVT(TM.createDataLayout().getPointerSizeInBits());
- if (Results.size() > 1 &&
- !TM.getSubtarget<WebAssemblySubtarget>(ContextFunc).hasMultivalue()) {
+ if (!WebAssembly::canLowerReturn(
+ Results.size(),
+ &TM.getSubtarget<WebAssemblySubtarget>(ContextFunc))) {
// WebAssembly can't lower returns of multiple values without demoting to
- // sret unless multivalue is enabled (see
- // WebAssemblyTargetLowering::CanLowerReturn). So replace multiple return
- // values with a poitner parameter.
+ // sret unless multivalue is enabled (see WebAssembly::canLowerReturn). So
+ // replace multiple return values with a poitner parameter.
Results.clear();
Params.push_back(PtrVT);
}
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.cpp
index ac7cf5b37fcaa4..f09700ec7c0527 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.cpp
@@ -179,3 +179,17 @@ unsigned WebAssembly::getCopyOpcodeForRegClass(const TargetRegisterClass *RC) {
llvm_unreachable("Unexpected register class");
}
}
+
+bool WebAssembly::canLowerReturn(size_t ResultSize,
+ const WebAssemblySubtarget *Subtarget) {
+ // We don't return multivalues when either of Emscripten EH or Emscripten SjLj
+ // is used. JS invoke wrapper functions, which are used in Emscripten EH/SjLj,
+ // don't support multiple return values, and the functions indirectly called
+ // from those JS invoke wrapper functions can't be lowered using multivalue
+ // results because they have to match the signature of the JS invoke wrappers.
+ // We can come up with a precise set of functions that are called from the JS
+ // wrappers if we do some bookeeping, but given that Emscripten EH/SjLj is an
+ // old feature whose use is expected to decline, we don't think it's worth it.
+ return ResultSize <= 1 || (Subtarget->hasMultivalue() &&
+ !WebAssembly::WasmEnableEmEH && !WasmEnableEmSjLj);
+}
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h b/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h
index 7f28fb1858a690..b430edba4627ae 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyUtilities.h
@@ -63,6 +63,11 @@ MachineInstr *findCatch(MachineBasicBlock *EHPad);
/// Returns the appropriate copy opcode for the given register class.
unsigned getCopyOpcodeForRegClass(const TargetRegisterClass *RC);
+/// Returns true if the function's return value(s) can be lowered directly,
+/// i.e., not indirectly via a pointer parameter that points to the value in
+/// memory.
+bool canLowerReturn(size_t ResultSize, const WebAssemblySubtarget *Subtarget);
+
} // end namespace WebAssembly
} // end namespace llvm
diff --git a/llvm/test/CodeGen/WebAssembly/lower-em-ehsjlj-multi-return.ll b/llvm/test/CodeGen/WebAssembly/lower-em-ehsjlj-multi-return.ll
index 4f33439db770dc..a4d0908299f4dc 100644
--- a/llvm/test/CodeGen/WebAssembly/lower-em-ehsjlj-multi-return.ll
+++ b/llvm/test/CodeGen/WebAssembly/lower-em-ehsjlj-multi-return.ll
@@ -1,8 +1,8 @@
-; RUN: not --crash llc < %s -enable-emscripten-cxx-exceptions -mattr=+multivalue 2>&1 | FileCheck %s --check-prefix=EH
-; RUN: not --crash llc < %s -enable-emscripten-sjlj -mattr=+multivalue 2>&1 | FileCheck %s --check-prefix=SJLJ
+; RUN: llc < %s -enable-emscripten-cxx-exceptions -enable-emscripten-sjlj -mattr=+multivalue | FileCheck %s
-; Currently multivalue returning functions are not supported in Emscripten EH /
-; SjLj. Make sure they error out.
+; Even with multivalue feature enabled, we don't enable multivalue returns when
+; Emscripten EH/SjLj is used. Also the invoke wrappers' (e.g. invoke_vi)
+; signature should not change.
target triple = "wasm32-unknown-unknown"
@@ -35,7 +35,32 @@ entry:
unreachable
}
+define void @invoke_i128() personality ptr @__gxx_personality_v0 {
+entry:
+ invoke i128 @get_i128()
+ to label %try.cont unwind label %lpad
+
+lpad: ; preds = %entry
+ %1 = landingpad { ptr, i32 }
+ catch ptr null
+ %2 = extractvalue { ptr, i32 } %1, 0
+ %3 = extractvalue { ptr, i32 } %1, 1
+ %4 = call ptr @__cxa_begin_catch(ptr %2)
+ call void @__cxa_end_catch()
+ br label %try.cont
+
+try.cont: ; preds = %entry, %lpad
+ ret void
+}
+
declare {i32, i32} @foo(i32)
+; CHECK-DAG: .functype foo (i32, i32) -> ()
+; CHECK-DAG: .functype invoke_vi (i32, i32) -> ()
+
+declare i128 @get_i128()
+; CHECK-DAG: .functype get_i128 (i32) -> ()
+; CHECK-DAG: .functype invoke_vii (i32, i32, i32) -> ()
+
declare i32 @__gxx_personality_v0(...)
declare ptr @__cxa_begin_catch(ptr)
declare void @__cxa_end_catch()
|
Emscripten EH/SjLj uses invoke wrappers in JS, such as `invoke_vi`, from which user functions are called indirectly, to emulate exceptions and setjmp/longjmp. But in case the invoked function returns multiple values or a 128-bit value, its the JS invoke wrappers cannot return multivalue because JS doesn't support that. So we should not enable multivalue returns for the JS invoke wrappers and also the functions called by those JS wrappers because their signature has to match with the JS wrapper. For example, if `func` returns `{i32, i32}` and we have ```ll invoke {i32, i32} @func() ... ``` while LowerEmscriptenEHSjLj will lower it down to something like ```ll %0 = call { i32, i32 } @"__invoke_{i32.i32}_void"(ptr @func) ... ``` we should eventually lower both the invoke wrapper (whose name will be changed later to `invoke_vi`) and `func` down to a signature that indirectly returns multiple values by memory parameter, because JS invoke wrappers do support multiple return values. So we need to disable multivalue returns for JS invoke wrappers and functions called by them. I think we have three ways to do that: 1. Make a set and add all functions that are invoked by JS invoke wrappers in LowerEmscriptenEHSjLj and pass it to the backend using an auxiliary data structure. We have a precedent of this kind of structure already, which is used for Wasm EH: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/CodeGen/WasmEHFuncInfo.h We can even add this set to `WasmEHFuncInfo` maybe, given that this is also a way of handling exceptions in Wasm. - Pros: Most precise - Cons: Auxiliary structure needed 2. Unless we record the invoked functions in LowerEmscriptenEHSjLj like 1, we don't have a way of precisely knowing the set of invoked functions. But they are all indirectly invoked, so in the backend we can check whether a given function is ever indirectly used (i.e., its pointer taken) by traversing its `users()` in the IR and if it is, we don't allow multivalue returns for them. - Pros: No auxiliary structure - Cons: IR checking overhead. More conservative than 1. 3. Disallow all multivalue returns when Emscripten EH or SjLj is enabled. - Pros: Simplest - Cons: Most conservative. This PR is doing 3. While it is the most conservative and possibly disallowing multivalue returns from more functions than needed, I chose this because it is the simplest, and given that hopefully more people will adopt Wasm EH going forward, I don't think there will be many people who would use multivalue and Emscripten EH/SjLj together and want the whatever performance benefit that multivalue return can bring, given that Emscripten EH/SjLj has already a huge performance cost. This is separate from whether we should make the multivalue return dependent on the multivalue feature or something else like a clang flag, which is being partly discussed in WebAssembly/tool-conventions#223. Whichever way we decide on that front, we still need to disable multivalue returns in case Emscripten EH/SjLj is used.
2137144
to
6a6966e
Compare
cc @TerrorJack in case you are interested. If you are using emscripten this may affect your results, but I don't think you are..? |
Doesn't it? I thought multivalue interacts with JS by having multivalue function called from JS return JS arrays of values. |
One thing to think about is, given that Emscripten EH is off by default but Emscripten SjLj is on by default unless you use Wasm EH, this may turn off multivalue returns for more cases than we intend to. So to sum up,
|
So if you really want multivalue returns, the only way to get them after this change is by also enabling Wasm EH? |
@aheejin Thanks for the ping! Our toolchain does not use emscripten sjlj mechanism fyi. |
Hmm, yeah, you're right. It's not an inherent limitation of JS but rather our invoke wrapper-specific issue... |
Either that or explicitly changing longjmp support via |
Co-authored-by: Thomas Lively <[email protected]>
Will abandon this given that Emscripten does not need this for now after #88492 lands. |
Emscripten EH/SjLj uses invoke wrappers in JS, such as
invoke_vi
, from which user functions are called indirectly, to emulate exceptions and setjmp/longjmp.But in case the invoked function returns multiple values or a 128-bit value, its the JS invoke wrappers cannot return multivalue because JS doesn't support that. So we should not enable multivalue returns for the JS invoke wrappers and also the functions called by those JS wrappers because their signature has to match with the JS wrapper. For example, if
func
returns{i32, i32}
and we havewhile LowerEmscriptenEHSjLj will lower it down to something like
we should eventually lower both the invoke wrapper (whose name will be changed later to
invoke_vi
) andfunc
down to a signature that indirectly returns multiple values by memory parameter, because JS invoke wrappers do support multiple return values.So we need to disable multivalue returns for JS invoke wrappers and functions called by them. I think we have three ways to do that:
Make a set and add all functions that are invoked by JS invoke wrappers in LowerEmscriptenEHSjLj and pass it to the backend using an auxiliary data structure. We have a precedent of this kind of structure already, which is used for Wasm EH:
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/CodeGen/WasmEHFuncInfo.h
We can even add this set to
WasmEHFuncInfo
maybe, given that this is also a way of handling exceptions in Wasm.Unless we record the invoked functions in LowerEmscriptenEHSjLj like 1, we don't have a way of precisely knowing the set of invoked functions. But they are all indirectly invoked, so in the backend we can check whether a given function is ever indirectly used (i.e., its pointer taken) by traversing its
users()
in the IR and if it is, we don't allow multivalue returns for them.Disallow all multivalue returns when Emscripten EH or SjLj is enabled.
This PR is doing 3. While it is the most conservative and possibly disallowing multivalue returns from more functions than needed, I chose this because it is the simplest, and given that hopefully more people will adopt Wasm EH going forward, I don't think there will be many people who would use multivalue and Emscripten EH/SjLj together and want the whatever performance benefit that multivalue return can bring, given that Emscripten EH/SjLj has already a huge performance cost.
This is separate from whether we should make the multivalue return dependent on the multivalue feature or something else like a clang flag, which is being partly discussed in WebAssembly/tool-conventions#223.
Whichever way we decide on that front, we still need to disable multivalue returns in case Emscripten EH/SjLj is used.