-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLVM 19 and enabling reference-types
by default
#233
Comments
Yes we were aware of this slightly strange fallout of enabled reference-types. For example we had (within Google) reports from folks using wamr that led to bytecodealliance/wasm-micro-runtime#3726. Its a rather odd side effect of including multi-table as part of reference-types. Using a single byte relocation for the call_indirect immediate would be a fairly big change since it would be the first relocation that was anything but 5 bytes. Are you sure its worth this effort? I don't think the goal of llvm or wasm-ld should be to produce MVP output unless Isn't this just a one time cost for all tools to update their defaults to match llvm? At least for binaryen and emscripten we try to keep all our defaults in line with llvm's defaults so everyone is on the same page. I imagine each time llvm changes its defaults there will be such one-time costs for all downstream tools, no? |
Also, just in case you were not aware, the linker has Also, any optimized build that goes through a tool like wasm-opt will also no longer have these non-canonical LEBs. Just in case that helps anyone who might come accross this issue. |
+cc @sunfishcode For features generally, I don't actually agree that we want to only generate MVP modules unless newer features are requested. Reference types is maybe a special case (since most C/Rust style code doesn't benefit from it) but a better example is nontrapping float-to-int conversion. Enabling this by default will make basically nontrivial program (at least C programs, not sure about Rust) a bit smaller and maybe faster, and there's really no downside to enabling it. Given that many users will not want to go through the list of all features and try to reason out which ones they explicitly enable (and therefore many will just use the default plus whatever is actually necessary) I think we should, as a general practice, enable most features by default once support has reached some threshold. Exactly what the criteria should be (for support levels in the wild, benefits of the feature, etc) could be debatable, but I think the "default" practice should be to enable things unless there's a reason not to. Reference types specifically could maybe be a special case. It's true that C/Rust style code normally doesn't need it. You could imagine a solution like the one you mentioned; this seems sort of analogous to the small vs large code models supported on some architectures, where there are different size limitations on the resulting code in exchange for more efficient code generation in some cases. 127 tables does seem like enough for most purposes. Probably we just didn't think about this when we made the original design (maybe @pmatos remembers?) |
FWIW, I don't think we expected the call_indirect issue before we made the change, but I don't think we would have done much differently if we had anticipated it (except perhaps communicate more broadly ahead of the change). We certainly would have wanted to make the change eventually for the reasons @dschuff gave, and reference types have been standard for long enough now that I don't think waiting longer would have materially changed the outcome. I'm nervous about the idea of capping the number of supported tables at 127. That's certainly sufficient today, but we've had all kinds of ideas about using separate tables for separate function signatures for various purposes and I wouldn't want to create a new problem for those ideas to run into in the future. |
Also it's possible that we may eventually want to enable exception handling by default for C++ (or even for Rust panics?), and that will require reference types. |
That's a good point @dschuff and I should clarify that I do not want to shackle LLVM or anyone else to the wasm MVP, I think it's very good to enable new proposals by default as things move forward. I can perhaps clarify my stance though in that if a selected operation is MVP compatible I think it's beneficial to have it encoded in a way that's MVP compatible. For example a nontrapping float-to-int conversion is not MVP compatible so there's nothing that can be done about that. For
The main exception to this I can think of is build configuration. We support building Wasmtime without a Now this itself can be split into a number of different possible solutions:
None of these seem like the obvious solution per-se, but I also suspect that Wasmtime isn't the only runtime that may want to indefinitely provide a build mode where GC and/or For the leb/relocation limits, my rough idea is that a new 1-byte relocation would be added for table indices. I realize that would be quite different from what's implemented today, though. I forgot though that LLD supported
Good point! (and agreed that Rust will want to enable it by default one day too) IIRC though the exceptions proposal doesn't require at least the GC/ |
I agree with the sentiment that this is a specific case, and shouldn't generalize to every time any proposal is enabled by default in LLVM. I think what makes this case more complicated is that the user's source and LLVM's emitted modules aren't actually leveraging any new features unlocked by the newly-default-enabled proposal. The only results in this particular case are worse code size and less portability. If LLVM were taking advantage of newly-unlocked features and using Similarly, I would expect that in the far future when the custom-page-sizes proposal has been finished and merged into the core spec, |
I agree with this, but in this case it is not the linker that is encoding the call_indirect immediate but the compiler. When reference types are enabled the compiler doesn't know if some other translation unit will add additional tables so it has to do the conservative thing and add a relocation. All the linker does is apply the relocations that the compiler generates. |
I don't think the code size argument applies here since optimized binary should already be running though something like |
I think adding a new single-byte relocation would be easier than trying to do something like compress-relocations but in reverse. The single-byte relocation type might not be so bad. If anyone tries to use more than 127 tables the linker could emit a meaningful error such as "obj.o does not support more than 127 tables. Please recompile with with -msomething`. We could also use the same default for multi-memory where the number of memory relocations will likely be larger by an order of magnitude and the number of memories likely always very small. In order words, I'm coming around to the single-byte relocation idea.. |
Oh I didn't realize |
I assume the idea would be to backport this change to the LLVM 19 branch? |
If it's reasonable enough to do so that'd be great, but if it's large enough and/or risky enough I don't think it's the end of the world to update this in LLVM 20 instead |
In general, it would also be a good idea for engines to support encoding changes introduced by new proposals, even if they are configured not to support the new features and semantics of those proposals. |
I was wondering about that.. this would be an nice solution, but can you explain why you think its a good idea? I suppose this would also involve updating the core spec tests to match pending/new proposals in these respects? |
To be clear, I'm only talking about proposals that have become part of the standard. The core spec tests need to be updated for such proposals anyway. Accepting new encodings even without accepting new features 1) prevents problems like this one, and 2) simplifies decoders that no longer need to have different behavior in different configurations. |
Well updating wasmtime to unconditionally accept LEB encoding for this immediate certainly seems like the simplest solution then. Currently both wabt and wamr gate the encoding change on reftypes being enabled, but if it makes since to make this unconditional that is cleaner and more compatible. |
IIRC, there have been If we wait to make this sort of change until a proposal is merged into the main spec, that is one way to work around the issue. Or we could adopt a policy in the CG to avoid writing such tests and change/fix them where they do exist. |
I doubt we would ever consider turning on a feature by default in LLVM before it was fully merged in the main spec repo. I think we can expect a substantial lag between stage 4 approval and LLVM enabling a feature by default. |
What Nick brings up though isn't the only problem with newer proposals. One issue is that while a proposal is being developed, as he mentioned, there are often contradictory tests where the main test suite says "this must be invalid" where a proposal then says "those same bytes must be valid". This means that at the binary decoding layer we have to have some sort of conditional, so that's how proposals are all developed and everything starts out as these sort of conditional branches in the decoder. Later on after the feature has been merged into the spec then the question is whether the feature flag should be removed. So far Wasmtime/wasm-tools haven't removed any feature flags and support disabling features to go back to the MVP for example. This is not necessarily generally useful to everyone but certain embedders have expressed a desire to be able to do this indefinitely. For example some embedders want to be able to always disable the simd proposal. Wasmtime's build configuration requires being able to disable at least the Basically, at least for Wasmtime/wasm-tools, it so far hasn't been a safe assumption that phase 4+ proposals are unconditionally enabled and can't be disabled. If that were the case this would be less of an issue for Wasmtime, but there's still motivation to turn off different features of WebAssembly. Another possible very-abstract solution would be the "profiles" option for wasm where the GC profile for wasm would probably include |
I'm not suggesting that feature flags be removed, but only that the decoder (and no other part of the codebase) behave as though standardized features are always enabled. It's a good point that the decoder would still need feature flags to avoid breaking spec tests while supporting proposals that are not yet standard, though. In that case, I would be happy to update the upstream spec tests to remove the contradiction. Alternatively, the feature flag guards in the decoder could be removed once the proposal reaches phase 4. |
In wasm-tools it actually used to do that but was relatively recently modified. If spec tests were updated so they never contradicted between proposals I agree that would remove the need for such a change though. That being said I might have less confidence than you that such a change to how spec tests are written would be widely agreed upon, but I'd definitely be willing to voice my opinion in favor of such a change! |
If it helps, these are the issues that arise when binary decoding assumes all features are active:
This issue is then replicated across the extended-const, function-references, gc, and memory64 repositories which have the same test in their I was actually surprised that this was the only issue. Historically multi-memory and reference-types caused a lot of similar issues. The reference-types proposal has long since merged, but I would have thought that multi-memory would still be causing issues, but apparently not! |
FWIW, we're about to start importing the upstream spec tests into Binaryen as a submodule, so we'll start running into the same issues. Another way you can minimize pain here is by ignoring the error message text in the assertions; that text matches the errors from the spec interpreter, but there's no requirement that other engines produce the same error messages. |
Yeah, the encoding thing is an interesting problem. Generally speaking I do like the idea of engines expanding the allowable encodings even if a feature itself is disabled. It doesn't necessarily solve all the problems (e.g. there could be engines that just don't support a new feature at all; tools might still want a way to disable use of a feature), but would solve a class of issues once a feature becomes standard). And for a tool like wabt or wasm-tools, there is pretty limited value in emulating a mode where a feature is unsupported (compared to the value of just letting a decoder decode all the features it knows about). WRT how it's speced, part of this problem does go away when an extension gets merged into the spec, because the spec assumes that both the new encoding and the new functionality will be unconditionally valid, and currently doesn't have profiles or a similar way to reason about features being enabled or disabled. So IIUC there couldn't be an |
Just wanna note that the spec requires all implementations to support this, and has for years now, full stop. Either we call out implementations that don't follow this as buggy and make it their problem. Or we don't, but then the only way of doing so would be by officially introducing suitable profiles. |
If including reference-types feature in the build increases code size in wasm-tools, I agree w/ @dschuff that it can be a good idea that wasm-tools unconditionally decodes the I'm little reserved about the idea of making the relocation 1-byte from LLVM side and limit it to 127 tables based on the similar reasons @tlively suggested. Also as @rossberg said, eventually, the engines need to support this at some point. I don't think we are going to enable the new EH proposal by default in LLVM soon; given that it was standardized very recently we would likely wait for at least a few years before doing that. But it is possible that Rust or other toolchains would like to opt in to the feature before that because it gives better code size and performance, in which case the engines needs to support the LEB index anyway. |
Doesn't the linker already support mixing object files without relocations for In other words, what would be the cost of hard-coding the indirect function table to have index 0 always, in both the linker and the compiler? Edit: fwiw, this would be a real size regression in the default configuration (which enables debug info) in our toolchain. |
I had hoped this would be possible, but sadly the indirect function table index cannot be assumed to be at index zero. The reason is that imported tables always come before defined tables in the index space. So any program that imports a table for any kind will cause the (defined) function index table to have a non-zero index itself. In other words, you can't both import a table and have a defined table at index zero. |
Hmm, of course (obvious in retrospect :)). This also means that all object files built without reference types enabled (or ~= without table index relocations) are incompatible with links that have them enabled. It seems like quite a big breaking change, ameliorated perhaps somewhat by the rarity of imported tables. |
### What Add compile errors when built with unsupported wasm features reference-types and multivalue. ### Why Rust 1.82 is likely to ship with target feature reference-types and multivalue enabled on wasm builds. These target features are not supported by the Soroban Env in the current protocol 21 or next planned protocol 22. It's not trivial for developers to disable target features because of how the rustc compiler only exposes the ability to buildstd with nightly. These compile errors will prevent someone from building .wasm files with the sdk when either of those target features are enabled. A Rust version check is not being done because it's actually easier to do a target feature check, and in the off chain somehow Rust 1.82 shipped without the target features enabled then the sdk could still be used with 1.82. Links: - https://discord.com/channels/897514728459468821/1289090985569026048 - stellar/rs-soroban-env#1469 - WebAssembly/tool-conventions#233 ### Releasing This change will be merged to `main` targeting v22, then should be backported to a v21 patch release immediately following.
### What Add compile errors when built with unsupported wasm features reference-types and multivalue. ### Why Rust 1.82 is likely to ship with target feature reference-types and multivalue enabled on wasm builds. These target features are not supported by the Soroban Env in the current protocol 21 or next planned protocol 22. It's not trivial for developers to disable target features because of how the rustc compiler only exposes the ability to buildstd with nightly. These compile errors will prevent someone from building .wasm files with the sdk when either of those target features are enabled. A Rust version check is not being done because it's actually easier to do a target feature check, and in the off chain somehow Rust 1.82 shipped without the target features enabled then the sdk could still be used with 1.82. Links: - https://discord.com/channels/897514728459468821/1289090985569026048 - stellar/rs-soroban-env#1469 - WebAssembly/tool-conventions#233 ### Releasing This change will be merged to `main` targeting v22, then should be backported to a v21 patch release immediately following. (cherry picked from commit 6c45fb9)
Hi @aheejin I am working on Near blockchain sdk. Which uses WebAssembly, compiled from Rust to execute smart contracts. reference-types feature breaks the smart contract execution due to deserialization error. So the developers can't compile their smart contracts with new version of Rust compiler. One of the possible solutions is to disable reference-types. However, as it requires to build standard library as well, it forces to use nightly compiler. Which is not an option as developers usually build with stable compilers. I am wondering, is it possible to build standard library as well on stable compiler? |
@PolyProgrammist you might be interested in this post which explains more about this change and has a section about disabling features with respect to recompiling the standard library. Recompiling the standard library is not possible on stable Rust but you might be interested in this proposal for a new Rust target where the standard library is built with minimal features. |
@PolyProgrammist I'm a little unclear about what the issue is with the stable vs nightly compiler. Are there LLVM changes post v19 that fix a problem? If they are small enough, merging them to the 19 branch might be an option. |
The problem is that I don't see a way to disable reference-types feature on stable rust compiler starting from 1.82. |
Have you looked into other possible solutions here:
(Obviously being able to disable reference types is important, in general, but these two should also both address your issue). |
The problem turns out to be bigger than just the runtime mentioned here. Major JavaScript bundlers such as webpack can no longer handle anything produced by LLVM 19 (be it Rust, C, C++, ...). |
@CryZe posted more detail in another thread
My response was:
Interestingly, it looks like the decoder in webassemblyjs already decodes the table index as an LEB. |
Continuing from the LLVM PR: From what I've seen the Though I think this may probably be an easy enough PR that I can look into it (as opposed to implementing the full proposal right away). |
Yeah as far as I know, the only use of anything from the reference types proposal that you get by default with LLVM builds is the long encoding of the table index as described upthread. The link from my previous post goes (I think) to the part of the decoder that would have to be updated but it appears that it would already handle the LEB encoding (however, that was based on 5 minutes of looking, so I could well be wrong about that). So it would be good to check for more details. Also, could you maybe try it out with bulk-memory and nontrapping-fptoint enabled ( |
I already have all the following enabled and they work perfectly fine: And even the (Despite some of them having miscompilation issues in LLVM, but I fortunately have not encountered any miscompilations in my code for those) |
Turns out it's an unfortunate interaction where |
Interesting. AFAIK the feature section is really only useful for the linker (at least, I'm not aware of other uses of it). Maybe we should strip it out after linking? Or is this an object file that is being processed by webassemblyjs? |
I think it is also designed for post-link tools such as wasm-opt and wasm-bindgen to use, no? |
I guess so, if wasm-bindgen has a whole separate mode for generating bindings based on the features :) |
This sounds like it's WAI with respect to the features section. The features section has always been intended (by me, at least) to encode the features supported by the target engine rather than the features actually used by the module, so A short-term workaround followed by a real fix in |
In LLVM 19 the
reference-types
feature is being enabled by default in llvm/llvm-project#96584. This is having consequences in Rust - rust-lang/rust#128475 - and is requiring documentation to be written - rust-lang/rust#128511 - for how to disable features (docs that should be written anyway, but this update is becoming a strong forcing function).The main consequence of enabling
reference-types
is that the table index immediate in thecall_indirect
instruction is now being encoded as an overlong uleb which is 5 bytes long as80 80 80 80 00
. This overlong encoding of 0 has no semantic difference from whenreference-types
were disabled but validators and parsers which don't support reference types are rejecting these modules.I wanted to raise this issue here for awareness to confirm that this is expected fallout. I understand that enabling
refernece-types
by default is expected, but I wanted to additionally confirm that the consequences of actually emitting a breaking change into all modules usingcall_indirect
relative to parsers that don't supportreference-types
is expected. This seems like surprising behavior to me where modules that don't usereference-types
at all are now required to be executed in runtimes that support thereference-types
proposal.Or, alternatively, if there's an easy-ish way to shrink the leb encoding here (e.g. by assuming there's <= 127 tables and continuing to use a single byte) I think that'd personally be best to generate MVP modules as much as possible and only emit newer features when explicitly requested.
cc @sbc100, @aheejin, @tlively
The text was updated successfully, but these errors were encountered: