Give components their own file type #432

Open
spotandjake opened this issue Dec 21, 2024 · 6 comments

Comments

@spotandjake

spotandjake commented Dec 21, 2024

The name of this issue might be a little narrow, but I wanted to make it clear what I am really suggesting. Currently a component is disambiguated from a regular wasm module only by the layer field in the 8-byte preamble. This works, but it makes the component model deeply integrated with wasm (which is a major point of controversy): it is basically now just a second bytecode that wasm supports, which greatly increases the amount of work for any tooling that wants to fully support wasm (toolchains like binaryen, runtimes, linkers). I think giving components their own identifier (file extension, mimetype, magic number) would really help, as you could target either a wasm component or a module. I think this also helps to reinforce the idea that components are really a superset of core wasm. Additionally, it makes it easier to search through documentation and for consumers to separate the two layers.

I think this also makes sense once you start looking at wit vs wac, where composition is a separate language from the component type definitions themselves.

@vados-cosmonic
Contributor

vados-cosmonic commented Dec 21, 2024

Just a note, but if you build a WebAssembly module versus a WebAssembly component you can tell the difference between them (standard tooling like file already supports this):

➜ file target/wasm32-unknown-unknown/debug/add.wasm
target/wasm32-unknown-unknown/debug/add.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)

➜ file target/wasm32-wasip1/debug/add.wasm
target/wasm32-wasip1/debug/add.wasm: WebAssembly (wasm) binary module version 0x1000d

To do so from a Rust environment you can use wasmparser::Parser::is_core_wasm and the other associated methods there (you could investigate those functions to see exactly how the preambles differ).
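For readers who want to see the preamble difference without pulling in wasmparser, here is a minimal std-only sketch. It assumes the 8-byte preamble layout of a 4-byte magic (`\0asm`) followed by a little-endian 16-bit version and a 16-bit layer, which is consistent with the `0x1000d` value that `file` prints above (version 0x0d, layer 1). The names `WasmKind` and `classify` are hypothetical and are not part of any library.

```rust
/// Which kind of binary a WebAssembly preamble announces.
/// (Hypothetical type for illustration; not wasmparser's API.)
#[derive(Debug, PartialEq, Eq)]
pub enum WasmKind {
    /// Core module: version 1, layer 0.
    CoreModule,
    /// Component: layer 1 (the version field is not inspected here).
    Component,
    /// Valid magic, but a version/layer pair this sketch does not know.
    Unknown { version: u16, layer: u16 },
}

/// Classify a binary by its first 8 bytes.
/// Returns `None` if the input is too short or lacks the wasm magic.
pub fn classify(bytes: &[u8]) -> Option<WasmKind> {
    let preamble = bytes.get(..8)?;
    if !preamble.starts_with(b"\0asm") {
        return None;
    }
    // Assumption: bytes 4..6 are the version, bytes 6..8 are the layer,
    // both little-endian.
    let version = u16::from_le_bytes([preamble[4], preamble[5]]);
    let layer = u16::from_le_bytes([preamble[6], preamble[7]]);
    Some(match (version, layer) {
        (1, 0) => WasmKind::CoreModule,
        (_, 1) => WasmKind::Component,
        _ => WasmKind::Unknown { version, layer },
    })
}
```

Calling `classify` on the first bytes of each `add.wasm` above should yield `CoreModule` for the first file and `Component` for the second. Real tooling should prefer the wasmparser methods, which also validate the version.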

@spotandjake
Author

spotandjake commented Dec 21, 2024

@vados-cosmonic I mentioned that the bytecodes are distinguishable because of the layer field, which is why what you describe can be done. What I am getting at is an actual separation between the two bytecodes, to better communicate what exactly the component model is. Given that the bytecode itself differs a lot, it seems like it would make sense to give it a unique file extension, mimetype and magic number.

This doesn't have a lot of actual runtime benefits and is more about clarity and setting a proper boundary between components and core wasm itself.

@vados-cosmonic
Contributor

vados-cosmonic commented Dec 21, 2024

Ah thanks for re-iterating what you wrote -- I think what I missed was:

it is basically now just a second bytecode wasm supports
...
I think if we gave components their own identifier (file extension, mimetype, magic number) it would really help as you could either target a wasm component or a module.

Supporting WebAssembly modules does not require supporting the Component Model, and the component model very much builds on top of WebAssembly core. I think it's hard to define them as "two bytecodes" rather than, as you noted, a superset: an extension of an existing bytecode.

I don't think I'm properly imagining the use case that would benefit from a unique file extension/mimetype/magic number. Just trying to think through it: regardless of which of those were chosen, when receiving data you're going to have to parse it to figure out whether you can deal with the incoming blob (or file) or not, correct?

To help me understand, could you sketch more of what the ideal world would look like to you? What standards, tooling, etc would change?

@spotandjake
Author

spotandjake commented Dec 21, 2024

It's more just about clutter. While runtimes are certainly not expected to support every proposal, there is always community pressure to support them. With regular proposals I think this is fine, since they add new features to core wasm. As the component model is a superset that introduces a new bytecode on top of wasm, I think it would make more sense to treat it as a separate thing entirely. This allows for more clarity when talking about the component model and when working with components directly (especially the file extension). It would also allow us to separate the runtimes entirely: a wasm runner could provide both a component runtime and a core wasm runtime (which is basically what is already happening) as opposed to just having one standard interface for both. This is more of a small semantic change to improve the boundaries and make the different feature sets more apparent than it is a feature addition.

Really the only changes to the standard would be removing the layer change in the preamble, changing the magic number for components to something different, and choosing a file extension and mimetype. Tooling and runtimes wouldn't have to change apart from following this new spec, though tooling could provide separate APIs for components and regular wasm where it makes sense. For example, components can't expose memory, so a wasm runner's API for a component wouldn't expose it whereas the regular runner's API would, and a shared API would stay the same.
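One way to picture the split API described here is the rough Rust sketch below. Every name in it (`Instance`, `CoreInstance`, `ModuleInstance`, `ComponentInstance`, `export_count`) is hypothetical and not taken from any existing runtime; it only illustrates a shared surface plus a memory accessor that exists for core modules alone.

```rust
/// Surface shared by both kinds of instance (hypothetical API).
pub trait Instance {
    fn exports(&self) -> &[String];
}

/// Extra surface only a core-module instance would offer.
pub trait CoreInstance: Instance {
    fn memory(&self) -> &[u8];
}

/// Stand-in for an instantiated core module.
pub struct ModuleInstance {
    exports: Vec<String>,
    memory: Vec<u8>,
}

/// Stand-in for an instantiated component; it has no memory field at all.
pub struct ComponentInstance {
    exports: Vec<String>,
}

impl ModuleInstance {
    pub fn new(exports: Vec<String>, memory: Vec<u8>) -> Self {
        Self { exports, memory }
    }
}

impl ComponentInstance {
    pub fn new(exports: Vec<String>) -> Self {
        Self { exports }
    }
}

impl Instance for ModuleInstance {
    fn exports(&self) -> &[String] {
        &self.exports
    }
}

impl CoreInstance for ModuleInstance {
    fn memory(&self) -> &[u8] {
        &self.memory
    }
}

impl Instance for ComponentInstance {
    fn exports(&self) -> &[String] {
        &self.exports
    }
}

/// Works on either kind, because it only needs the shared surface.
pub fn export_count(instance: &dyn Instance) -> usize {
    instance.exports().len()
}
```

With this shape, calling `memory()` on a `ComponentInstance` is a compile error rather than a runtime failure, which is the kind of boundary the comment is arguing for.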

@oovm

oovm commented Dec 24, 2024

Can you explain why we need to distinguish magic numbers?

wasm-component is just a simple binary, and using .wasi directly will not be rejected by wasm-tools.

I use the .wasi suffix privately to prevent me from confusing artifacts, and I have not encountered any toolchain problems so far.

@spotandjake
Author

Can you explain why we need to distinguish magic numbers?

wasm-component is just a simple binary, and using .wasi directly will not be rejected by wasm-tools.

I use the .wasi suffix privately to prevent me from confusing artifacts, and I have not encountered any toolchain problems so far.

As I noted before, this isn't an actionable need: file extensions and mimetypes are really just cosmetic. The change is semantic. As the component model is going to exist in a separate spec from core wasm, and the bytecode of a component and a regular module differ greatly, I think it would make sense to reflect this in the file extension, mimetype and magic number.
