Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support allocator inside of runtime #61

Closed
wants to merge 9 commits into from

Conversation

yjhmelody
Copy link

@yjhmelody yjhmelody commented Dec 27, 2023

rendered

Related PR paritytech/polkadot-sdk#1658

BTW, I'm not a member of polkadot fellows, does this have any impact on RFC implementation?

Copy link
Contributor

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general not sure we need this with the existence of: #4

Comment on lines +47 to +51
```wat
(export "alloc" (func $alloc))
(export "dealloc" (func $dealloc))
(export "realloc" (func $realloc))
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need these with: #4

Copy link
Author

@yjhmelody yjhmelody Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need these with: #4

It's used by entrypoint call(e.g. runtime api), right? But optional for host functions

@yjhmelody
Copy link
Author

yjhmelody commented Dec 27, 2023

I mentioned it in there already, it's basically orthogonal
@bkchr

@tomaka
Copy link
Contributor

tomaka commented Jan 2, 2024

I don't see how #4 is orthogonal. To me, it's one or the other, as they each render the other PR pointless. The only exception is we could do #61 plus some parts of #4 (in other words not all of #4).

From a pure logical point of view, I think that #4 is fundamentally a better idea because it doesn't complexify the flow of execution.
With what this PR proposes, the host will call a runtime function, then the runtime can call a host function, which can call a runtime function (alloc, realloc, or dealloc).
If alloc, realloc or dealloc lock a mutex that was already locked before the allocation/free, then you've got a deadlock.
If alloc, realloc or dealloc call a host function, then you might have a stack overflow.
You can counter-argue by saying that this is just a matter of being careful and that it won't happen in practice, but I think that it's still a point in favor of #4.

When it comes to performances, I have no idea which one of the two is better and I don't think it's possible to determine that without implementing them and comparing.

In my opinion, if either #61 or #4 (or #61 plus some parts of #4) is significantly faster than the other, then we should go for it, but if they compare more or less equal then we should go for #4.

Comment on lines +34 to +40
We define the spec as version 1, so the following `dummy` function `v1` MUST be exported to hint
client that runtime is using version 1 spec, otherwise rollback to `legacy` allocator.
The function should never be used, and its name is only for version checking.

```wat
(export "v1" (func $v1))
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we need to add this export when we could just detect whether alloc, realloc and dealloc are exported..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is a personal preference, if there is a v2 version, adding more features, then I always just check the version number to make sure that the subsequent logic is for that version number. Personally, I always hope that some specifications can be strongly bound to the version number.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like in this RFC, I named the original allocator legac, you can also call it v0 allocator. For other substrate implementations (smoldot or other language based substrate/polkadot), they can consider no longer supporting the old version, which is always clearer

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than listing RFC numbers in code comments, I prefer to use version numbers to indicate new features (I hate EIPs like that)

@yjhmelody
Copy link
Author

yjhmelody commented Jan 10, 2024

@tomaka

From a pure logical point of view, I think that #4 is fundamentally a better idea because it doesn't complexify the flow of execution.

I do not think so, it will never complexify the flow of execution, see the draft PR, it's very clear.

If alloc, realloc or dealloc lock a mutex that was already locked before the allocation/free, then you've got a deadlock.
Why there is a lock ? it's single thread. Even if there is a lock in wasm, this is a runtime logic bug, which will be encountered in runtime_apis, too. You can't prevent an exported function having a bug or falling into an infinite loop.

If alloc, realloc or dealloc call a host function, then you might have a stack overflow.

runtime_apis also wil encounter this problem.

You can counter-argue by saying that this is just a matter of being careful and that it won't happen in practice, but I think that it's still a point in favor of #4.

This is a matter of code practice, why don't you worry about runtime_api encountering such problems?

I don't see how #4 is orthogonal. To me, it's one or the other, as they each render the other PR pointless. The only exception is we could do #61 plus some parts of #4 (in other words not all of #4).

The reason why I think it is orthogonal is that

  • Some programming languages that support wasm are not convenient for importing allocator.
  • The signature style of the host function determines that it can be used for runtime side allocator/client side allocator/both.

If you care about performance, you have to compare before deciding whether to change.
If you care about ease of use/versatility, then you should probably consider whether both can be supported at the same time

The goals of this RFC are completely different from #4. I don't mind #4 moving forward. This RFC just provides more possibilities for #4 BTW.

@yjhmelody
Copy link
Author

What I am most concerned about with this PR is the versatility of the runtime, which helps the substrate runtime expand. Haven't you discovered that by combining the runtime_api.reimplmentation and allow missing host functions configurations and removing the imported allocator, the substrate runtime can be a pure wasm (actually no need to call any host function).
Whether it is integrating into other ecosystems or integrating other ecosystems into substrate, it is convenient.

@yjhmelody
Copy link
Author

yjhmelody commented Jan 10, 2024

Hi @radkomih I think you will be concerned about this RFC

@burdges
Copy link

burdges commented Jan 12, 2024

Around this and #4 ..

We'd ideally have the approximate goal that host calls should almost never pass owned data, only borrowed data like &'s [u8] and &'a mut [u8] buffers.

In principle, this requires all current host calls be replaced, not unlike #4 does, but realistically some hostcalls would still pass owned data if they really needed the owned data. We'd often permit -> Vec<u8> instead of passing an allocator callback for example, but we'd replace some by -> &'static [u8] host calls, and avoid -> Vec<u8> host calls in other ways.

We'd gradually gain more from this RFC as we replaced more host calls by borrowing flavors. Rust handles borrowed data nicest of course. Yet, other langauges could avoid inefficencies like call wrappers that take owned being the runtime allocator, copy it to owned data behind the system allocator, make the host call, and then allocate & copy in reverse. Instead, they'd make the hostcall directly on borrows of data managed by the runtime, either owned or borrowed.

In other words, I'd think this and changes like #4 should benefit each other.

@yjhmelody
Copy link
Author

yjhmelody commented Feb 7, 2024

@bkchr Hi, want to know your thinking.
I could continue to implement this RFC code if it's merged.

@yjhmelody
Copy link
Author

This allows the verification block to use pure wasm (just replace the host functions with wasm functions), which can extend the runtime verification to a lot of scenarios

@tomaka @bkchr

@koute
Copy link

koute commented Feb 23, 2024

Just my two cents: I'm in favor of #4 over this RFC, and I also strongly dislike the the fact that this would require us to call back into the runtime from a host function, which is a can of worms that once opened exposes us to fun things like the possibility of a malicious program of overflowing the host's stack, etc. I think that if possible we should keep the model of "host calls don't call back into the runtime".

As far as performance is concerned, #4 should be as fast or faster when optimally implemented (that is, if we guarantee that every host call will only have to cross the guest-host boundary twice at worst, and only once at the best).

@yjhmelody
Copy link
Author

yjhmelody commented Feb 23, 2024

and I also strongly dislike the the fact that this would require us to call back into the runtime from a host function, which is a can of worms that once opened exposes us to fun things like the possibility of a malicious program of overflowing the host's stack, etc.

@koute I still have the same point of view, this does not conflict with RFC #4, and all current runtime apis rely on the method you say you don't like.
Therefore, exporting alloc/dealloc will not increase this risk. The risk they face is the same as that of the common runtime API.

I would like to emphasize again that the focus of this RFC is to extend the substrate runtime capabilities, not simply to improve performance. It does not prevent the implementation of RFC#4, it just provides another possibility

@koute
Copy link

koute commented Feb 23, 2024

and all current runtime apis rely on the method you say you don't like.

Huh? But they don't call back into the guest? So can you clarify what you mean here?

@yjhmelody
Copy link
Author

yjhmelody commented Feb 23, 2024

Huh? But they don't call back into the guest? So can you clarify what you mean here?

I mean the api defined by sp_api::decl_runtime_apis!. Alloc/dealloc is not different with them except the call conversion.

@koute
Copy link

koute commented Feb 23, 2024

Huh? But they don't call back into the guest? So can you clarify what you mean here?

I mean the api defined by sp_api::decl_runtime_apis!. Alloc/dealloc is not different with them except the call conversion.

Huh? But they don't call back into the guest? So can you clarify what you mean here?

I means sp_api::decl_runtime_apis!. Alloc/dealloc is not different with them except the call conversion.

Sorry, I still don't know what you mean. (:

The way the host calls currently work is that they don't call back into the runtime. The runtime calls the host function, the host function does its thing, and the control returns back to the host. Yes, the host does call into the runtime initially so technically you have a chain of "host -> guest -> host", but that only happens once.

So for the sake of argument let's assume we implement this RFC. It will allow us to call the guest's allocator from the host. What is the use case here?

  1. Allocate the initial input payload to the runtime API. But if we implement Remove unnecessary host functions allocator usage #4 then this is unnecessary.
  2. Allocate the dynamically sized return values of the host calls. Ideally we do not want to do this anyway (because it will allow a recursive "host -> guest -> host -> guest -> host -> guest -> ..." call loop) and if we implement Remove unnecessary host functions allocator usage #4 then this is unnecessary (because no such host functions will exist anymore).

So I agree with @tomaka and I don't see how this RFC is orthogonal to #4; if we implement #4 then it pretty much makes this RFC pointless, unless you actually have an use case that could make use of this?

@yjhmelody
Copy link
Author

Supports a pure wasm environment to verify blocks. Currently, if you want pure wasm to verify blocks, you need to replace all host functions, but alloc itself cannot be replaced.
This reduces the requirements on the host environment and requires only a wasm executor, so the verification logic can be executed on many chains.

@koute
Copy link

koute commented Feb 23, 2024

Supports a pure wasm environment to verify blocks. Currently, if you want pure wasm to verify blocks, you need to replace all host functions, but alloc itself cannot be replaced. This reduces the requirements on the host environment and requires only a wasm executor, so the verification logic can be executed on many chains.

But this is also true after implementing #4 as far as I can see, because the host-side allocator will not be used at all anymore.

@yjhmelody
Copy link
Author

This doesn't seem to hold true in the case of legacy host functions.
But rfc #4 is indeed a good improvement

@koute
Copy link

koute commented Feb 23, 2024

This doesn't seem to hold true in the case of legacy host functions.

...which is exactly the same as this RFC, no? (:

Not that it matters, because the legacy host functions won't be used anymore for new blocks and will only be kept for sync compatibility.

@yjhmelody
Copy link
Author

yjhmelody commented Feb 23, 2024

Maybe you're right.
ext_input_size_version_1/ext_input_read_version_1 are more like a smart contract style design.
But my idea is to set up stubs(panics) for all import functions, but not actually call them, like parachain style.
I'm a bit convinced by you

@yjhmelody
Copy link
Author

Closed since RFC #4 will be implmented.

@yjhmelody yjhmelody closed this Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants