Using wasm globals to represent data/function addresses in object files #153
Comments
Could be either an upside or a downside: This might motivate us to actually make linker relaxation work, which could have other benefits.
I particularly like that this would give data symbols names in the name section, but I am concerned about depending more on wasm-opt for code size and performance. Is it correct that implementing linker relaxation would essentially negate any simplicity benefits we would get from this? Also, what additional benefits would linker relaxation bring?
Performance-wise I imagine that
Yes, I guess point (1) doesn't really hold up if we end up doing linker relaxation in response to this change. The other 3 arguments are stronger than that one anyway, I think. Regarding linker relaxation, one other example might be code that was compiled with TLS but then linked without. I would imagine it could transform the following pattern/relocation:
Into just:
Well, it would simplify the compiler and move the complexity to the linker. It could still be a benefit if, e.g., the compiler needed to have different codepaths in more than one place, whereas the linker can just have a different codepath in only one place. Not sure if that's actually the case though.
Traditional linker relaxation basically just rewrites code in place (leaving nop padding) rather than actually shrinking anything, presumably for speed and simplicity. Are we imagining that as a possibility too? Or are we already rewriting everything in the linker to get smaller LEBs?
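To make the "rewrite in place" versus "smaller LEBs" trade-off concrete, here is a minimal sketch. It assumes relocatable index and address fields are written as LEB128 padded to a fixed 5 bytes, which is what allows a linker to patch a value without moving any surrounding bytes; the minimal encoding is what a shrinking rewrite would emit instead. The function names are hypothetical and are not taken from wasm-ld or LLVM.

```python
def uleb128(value: int) -> bytes:
    """Minimal unsigned LEB128 encoding (as few bytes as possible)."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def uleb128_padded(value: int, width: int = 5) -> bytes:
    """Unsigned LEB128 padded to a fixed width (assumed: 5 bytes for 32-bit fields)."""
    if value < 0 or value >= 1 << (7 * width):
        raise ValueError("value does not fit in the requested width")
    out = bytearray()
    for i in range(width):
        byte = value & 0x7F
        value >>= 7
        if i < width - 1:
            byte |= 0x80  # force continuation on every byte except the last
        out.append(byte)
    return bytes(out)


def decode_uleb128(data: bytes) -> int:
    """Decode an unsigned LEB128 value; padded and minimal forms decode identically."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            break
    return result


if __name__ == "__main__":
    address = 1024  # hypothetical final address chosen by the linker
    print(uleb128(address).hex(), uleb128_padded(address).hex())
```

Both forms decode to the same value, so patching the padded field in place is cheap but leaves the extra bytes behind, while switching to the minimal form changes the size of the instruction and therefore every offset that follows it.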
We do have an (off by default) option called |
So it sounds like there is no upside to (1), and that we don't think we will gain much from linker relaxation. But I think the other arguments for doing this still stand.
At first glance, it looks like this should be compatible with module linking; does that sound right? Objdump has a -r option which can show the relocations interspersed with the disassembly, which is useful for other kinds of relocations as well. It seems like it wouldn't be bad if the other disassemblers people use could do this too. Like @dschuff I also wonder how much linker relaxation affects linking speed. A variant of this proposal would be to emit imports for globals, but continue to codegen addresses as |
I'm having trouble understanding what you are suggesting. Why emit imports for globals at all if we are going to generate |
It'd just arrange for all undefined symbols to have imports, which I imagine would make it easier to think about things like
Sorry, are you suggesting that, given an undefined external data symbol
I'm still not sure what I think overall, but it does feel like there's a coherent design in this. It'd be a |
tl;dr: Should we use globals to hold data addresses, even in non-PIC object files?
The current tooling convention for object files (and the default used by LLVM) is to represent data and function addresses only in relocations.
Taking the address of a data symbol or function symbol results in the following code:
Where the reloc exists only in the linking section and is either a `R_WASM_MEMORY_ADDRESS_LEB` (for data symbols) or `R_WASM_FUNCTION_INDEX_LEB` (for function symbols).
With the experimental PIC ABI used by emscripten, these addresses are instead modeled as wasm globals, which produces the following pattern:
In this case the relocation type is `R_WASM_GLOBAL_INDEX_LEB`.
When linking objects built with `-fPIC` into static binaries, the linker creates internal immutable globals that represent the static address of each symbol. Because the global is internal and immutable, `wasm-opt` can then completely eliminate the global and replace the `global.get` with an `i32.const`. This can be thought of as a form of linker relaxation that happens in the post-link optimizer. With a little work we could teach wasm-ld to perform this relaxation directly.
My suggestion is to use globals in a similar fashion by default, even when `-fPIC` is not specified. Here are some of the advantages, as I see them:
1. It simplifies llvm, having just one way to get symbol addresses.
2. `-fPIC` object need no longer be special)
3. `wasm2wat` and `wasmdis`, whereas this information was previously hidden in the relocation data.
4. `--allow-undefined` doesn't do what most people think it does WRT data symbols.
Downsides: