Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Rust v0 symbol mangling scheme #474

Open
cerisier opened this issue May 8, 2024 · 5 comments
Open

Add support for Rust v0 symbol mangling scheme #474

cerisier opened this issue May 8, 2024 · 5 comments

Comments

@cerisier
Copy link

cerisier commented May 8, 2024

Rust has defined its own symbol mangling scheme (https://rust-lang.github.io/rfcs/2603-rust-symbol-name-mangling-v0.html) which is currently not supported since it differs completely from the C++ name mangling scheme speedscope already support.

Would you accept a Pull Request to bring this support to Speedscope ?

@cerisier
Copy link
Author

cerisier commented Jan 5, 2025

@jlfwong any thoughts on this ?

If I understand correctly, the current C++ demangling support is a javascript (non WASM) emscripten compiled version of __cxa_demangle most probably from the GNU libiberty library.

I see 2 options for supporting Rust symbol demangling:

  1. Using the pure C implementation of the rust_demangle function also available in the GNU libiberty library.
  2. Using the official rustc_demangle crate (WASM will be required tho)

Would you be interesting in a PR using this implementation ?

I was also wondering what you would think of targeting WASM instead, as today the __cxa_demangle was compiled using the standalone Javascript target (without WASM) which is not only less performant but also much bigger in terms of size. Would you be open to WASM modules for this (with all due documentation and compilation process) ?

@cerisier
Copy link
Author

cerisier commented Jan 5, 2025

Another potential solution would be to use a single demangling function compiled into Javascript or WASM that would be using llvm::demangle (the same way perfetto those) and that handles both C++ and Rust the right way.

@jlfwong
Copy link
Owner

jlfwong commented Jan 5, 2025

Hey @cerisier, I'm open to a PR that implements this.

Between the solutions you're describing, the thing I'd like to do is whatever...

  1. Minimizes additional dependencies placed on speedscope (dealing with bitrot is a huge maintenance nightmare, and I like that the current C++ demangling has zero dependencies). For example, I'm not excited about the idea of adding crate or rust compilation as part of speedscope's build toolchain.
  2. Minimizes bundle size. We already lazy load the C++ demangling code to make sure it doesn't block initial load, but I'd like to keep total code size you need to download before you can actually demangle C++ code as low as possible.

Right now the C++ demangling code is ~160kB. I added that a long time ago, so I suspect I was choosing the path of least resistance there, and adding a minified JS ball was just really easy. If adding a wasm module is significantly smaller than that, and could be done without adding a huge dependency chain, I'd be open to it.

Can you help me understand what the tradeoffs between the approaches you're describing are in terms of resulting code?

@cerisier
Copy link
Author

cerisier commented Jan 6, 2025

Thanks for your answer !

Hey @cerisier, I'm open to a PR that implements this.

Between the solutions you're describing, the thing I'd like to do is whatever...

  1. Minimizes additional dependencies placed on speedscope (dealing with bitrot is a huge maintenance nightmare, and I like that the current C++ demangling has zero dependencies). For example, I'm not excited about the idea of adding crate or rust compilation as part of speedscope's build toolchain.

Understood. That said, even if the crate was the way to go, we'd probably ship a single compiled JS blob.
As for the 0 dependencies, I'll ensure whatever solution I suggest follows this mantra.

  1. Minimizes bundle size. We already lazy load the C++ demangling code to make sure it doesn't block initial load, but I'd like to keep total code size you need to download before you can actually demangle C++ code as low as possible.

Understood, I'll suggest an approach that keeps optimizes for bundle size and if possible, adds nothing to the initial load, keeping everything lazy loaded and async.

Right now the C++ demangling code is ~160kB. I added that a long time ago, so I suspect I was choosing the path of least resistance there, and adding a minified JS ball was just really easy. If adding a wasm module is significantly smaller than that, and could be done without adding a huge dependency chain, I'd be open to it.

I'll dig more into that and suggest the best of the 2 options with regards to the 2 important points above.

Out of curiosity, do you remember the process you used to get to that JS blob ?

  1. emscripten command line arguments ?
  2. extra Javascript calling the codegen?
  3. minifier command to bundle everything into a single blob?

That would help me figure out the best approach to fit with the existing.

Can you help me understand what the tradeoffs between the approaches you're describing are in terms of resulting code?

Sure,

I believe there are several options possible:

  1. Use the exact same library and technique you've used so far (and bridge the pure C rust_demangle function of the GNU libiberty library. And compile that into a pure Javascript file. This is the safest approach and would only result in an increase of size of the existing Javascript blob since it would contain another function. (For this tho, It'd be helpful to have the process you used to get to the first one).
  2. Same as 1. except using the WASM backend of emscripten. This would bring an additional .wasm file which would need to be loaded asynchronously anyway (so fits well with the existing lazy load approach) and I believe would result in close to 50% code size reduction compared to the current pure Javascript version. There would still be a small Javascript blob but much smaller and only containing glue code to bridge with WASM.

I think it's probably best to forget the crate option because it would diverge from how we could expose __cxa_demangle which is written either in C or C++ and would need to combine both emscripten and a rust wasm toolchain like wasm-pack.

I was also thinking of documenting the process along the way so that further improvements are easier to make.

@jlfwong
Copy link
Owner

jlfwong commented Jan 6, 2025

That approach sounds good to me! Re: how the JS ball was made, I don't remember, but a bit of code archeology from #2 suggests https://github.com/nattofriends/c-filtjs which may have the actual build commands used.

The URL referenced in a comment in the code is dead, but the internet archive link should work: https://web.archive.org/web/20200808121129/https://d.fuqu.jp/c++filtjs/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants