Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm2c: Add macro and tests to disable Wasm's force_read #2357

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

shravanrn
Copy link
Collaborator

@shravanrn shravanrn commented Dec 15, 2023

Add macro WASM_RT_NONCONFORMING_DISABLE_FORCE_READ to allow the embedder to disable force reads

Depends on #2356 (Only see the changes in the last commit in this PR)

@shravanrn shravanrn requested review from keithw and sbc100 December 15, 2023 08:25
@shravanrn shravanrn force-pushed the forceread_nonconform branch from c98c303 to c347198 Compare December 15, 2023 08:32
@shravanrn shravanrn changed the title wasm2c: Add macro and tests to allow disabling stack exhaustion checks wasm2c: Add macro and tests to disable Wasm's force_read Dec 15, 2023
@sbc100
Copy link
Member

sbc100 commented Dec 15, 2023

@matthias-blume will like this I think? Isn't this one of the patches you guys already hold in your fork?

@shravanrn shravanrn force-pushed the forceread_nonconform branch from c347198 to bd7bcdd Compare December 15, 2023 18:58
@keithw
Copy link
Member

keithw commented Dec 15, 2023

Looks reasonable, but let's iterate on the name a bit. I'd love for these hooks to describe basically what part of the spec we're violating conformance with, rather than the particular internal mechanism we end up skipping.

Functionally what this would do is violate conformance with Wasm's requirement that an OOB memory load must produce a trap, even if the load can safely be optimized out. It's basically a determinism thing (while preserving sandboxing/safety).

@keithw
Copy link
Member

keithw commented Dec 15, 2023

Also @sunfishcode if you still hold the view you did in #1432 about whether a W3C-hosted project should add hooks that allow a host to request nonconformance, would love to hear your thoughts now while this is under consideration.

@shravanrn
Copy link
Collaborator Author

shravanrn commented Dec 15, 2023

let's iterate on the name a bit.

@keithw How about WASM_RT_NONCONFORMING_DISABLE_UNUSED_OOB_READ_TRAP? It's a bit long but hopefully captures the meaning.

@shravanrn
Copy link
Collaborator Author

shravanrn commented Dec 15, 2023

@keithw Also a separate question here. Is the force_read needed when we do explicit bounds checks? In theory, the compiler should eliminate unused reads only when we use guard pages --- should we consider disabling force_read when we do bounds checks? The tests with explicit bounds checks without force_read seem to pass on my Ubuntu machine

@sbc100
Copy link
Member

sbc100 commented Dec 15, 2023

@keithw Also a separate question here. Is the force_read needed when we do explicit bounds checks? In theory, the compiler should eliminate unused reads only when we use guard pages --- should we consider disabling force_read when we do bounds checks? The tests with explicit bounds checks without force_read seem to pass on my Ubuntu machine

That sounds right to me.

@keithw
Copy link
Member

keithw commented Dec 15, 2023

Agreed we don't need force_read with explicit bounds checking (although would prefer not to make that change at the same time as adding nonconformance).

On the name, it seems like what we're doing is allowing nondeterministic elision of unobserved memory loads (and not really as part of the runtime). "Unobserved" meaning that the semantics are invariant to the value produced by the load as long as there is one. How about... WASM2C_NONCONFORMING_LOAD_ELISION ?

@shravanrn shravanrn force-pushed the forceread_nonconform branch from bd7bcdd to 4151fe0 Compare December 15, 2023 22:57
@shravanrn
Copy link
Collaborator Author

shravanrn commented Dec 15, 2023

@keithw Done! I've also migrated the run-spec-tests.py changes to this PR since they weren't needed in the #2356. Could you sign off on the change when you have a chance or are we waiting to hear from @sunfishcode?

@sunfishcode
Copy link
Member

sunfishcode commented Dec 15, 2023

WebAssembly is a brand, and it is a brand that all of us working in this ecosystem share. "WebAssembly is sandboxed" is part of the brand. Even if you have for no need for this, if you weaken this brand, you weaken it for all of us that share it.

I don't want users comparing benchmarks of unsandboxed "WebAssembly" against conforming Wasm engines. I don't want engines to face competitive pressure to add options to disable their sandboxes. I don't want numbers published in academic papers about "WebAssembly" to lack sandboxing. We can't outright forbid things, but we can refrain from promoting them, and we can work to create norms where sandboxing isn't seen as slow-mode WebAssembly.

It's interesting to think about what might be possible using the techniques from MinSFI, because it seems like it would be technically superior for what I understand these wasm2c use cases need. There'd be no need to go through Wasm's idea of portability or sandboxing constructs other than the minimum needed. You could use things like architecture-specific intrinsics, architecure-specific optimization options, architecture-specific #ifdefs in the source code, and so on.

@shravanrn
Copy link
Collaborator Author

shravanrn commented Dec 15, 2023

WebAssembly is a brand, and it is a brand that all of us working in this ecosystem share. "WebAssembly is sandboxed" is part of the brand.

@sunfishcode To be clear, this does not remove the sandboxing. This change would break spec compliance, but will not break sandboxing. The worse case scenario here is that OOB reads which would normally trap, are entirely eliminated if unused.

I have more thoughts, but I wanted to ask if this changes anything in your response first.

@keithw
Copy link
Member

keithw commented Dec 16, 2023

@sunfishcode, for context we're talking about two different nonconforming proposals.

In #2356 there's a proposal to let the runtime implementation not verify the embedder's ability to catch and deliver a trap in the case of stack exhaustion. On Linux and Mac, it can do this in a zero-cost manner by catching a segfault on an altstack and delivering a trap, but on Windows my understanding is that stack overflow can trigger the fault but our spectest runner doesn't have a way to catch it (and so stack-depth counting is how it passes the spec tests on Windows). The #2356 proposal would instruct the runtime to allow the embedder to promise that it's willing to receive a segfault (or whatever Windows calls it) in the case of stack exhaustion. Maybe a Windows expert can make a language-lawyer argument that this is still conforming; stack exhaustion still produces some sort Windows trap, but not one our spectest runner knows how to recover from. (And to be clear -- only stack exhaustion is at issue here; memory loads/stores, call_indirect, etc. aren't affected.)

In this PR (#2357) the proposal is to compromise determinism rather than isolation. The idea is to let the consumer nondeterministically elide memory loads when the load's value is unobservable, i.e. when the semantics are invariant to the value produced as long as there is one. Currently, wasm2c prevents the C compiler from optimizing out the load by inserting inline asm that forces the result of the load to be live (even if it could otherwise be elided in an optimization pass). But there's a substantial performance hit from doing this. I suspect that at least some of the other LLVM WebAssembly engines may not be so careful about spec conformance in this area. I think this one could plausibly become a formal proposal, maybe by extending memtype to indicate a tolerance for (nondeterministic) load elision for loads on that memory and then providing a tool to set this bit on a module's memories without having to parse the code section.

@sunfishcode
Copy link
Member

@sunfishcode To be clear, this does not remove the sandboxing. This change would break spec compliance, but will not break sandboxing. The worse case scenario here is that OOB reads which would normally trap, are entirely eliminated if unused.

I have more thoughts, but I wanted to ask if this changes anything in your response first.

Ah, thanks for correcting me. That does make it less concerning, however I still do find it concerning.

@sunfishcode, for context we're talking about two different nonconforming proposals.

In #2356 there's a proposal to let the runtime implementation not verify the embedder's ability to catch and deliver a trap in the case of stack exhaustion. [...]

This situation sounds less worrisome. WebAssembly doesn't currently specify what a trap looks like to an embedder. This just sounds like something that embedders need to be aware of when using implementations that do that.

In this PR (#2357) the proposal is to compromise determinism rather than isolation.

The scenario is: someone builds a Wasm program that by accident does a wild store. This program happens to run fine in wasm2c with this flag, and then users come to depend on the flag. This puts pressure on other wasm engines to implement the flag as well.

I suspect that at least some of the other LLVM WebAssembly engines may not be so careful about spec conformance in this area.

You may be right, though we do have spec testsuite tests for the issues we're aware of, and we can add new tests when we learn of new issues. We can't outright enforce compliance, but we do have ways to promote it.

I think this one could plausibly become a formal proposal

Yes, this is ideally how the process should work. We should change the spec rather than bypassing it.

@keithw
Copy link
Member

keithw commented Dec 31, 2023

Great -- let's move ahead with #2356 then, and hold off on merging #2357 until one of us has posted in the design repo and socialized the idea more formally, perhaps to the point of a phase 0 proposal. Somebody (maybe me) should check how other optimizing engines (TurboFan, BaldrMonkey, Wasmer LLVM, WasmEdge, WAVM, ...) handle this.

In this PR (#2357) the proposal is to compromise determinism rather than isolation.

The scenario is: someone builds a Wasm program that by accident does a wild store. This program happens to run fine in wasm2c with this flag, and then users come to depend on the flag. This puts pressure on other wasm engines to implement the flag as well.

Just to be clear, a wild store isn't the issue here -- this is about a wild load whose value is unobserved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants