Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define IBC communication #49

Merged
merged 25 commits into from
Jun 16, 2023
Merged

Define IBC communication #49

merged 25 commits into from
Jun 16, 2023

Conversation

ethanfrey
Copy link
Collaborator

@ethanfrey ethanfrey commented Jun 6, 2023

Closes #7

  • Document different communication channels (control, rewards, slashing)
  • Define approach for reward payments
  • Rough description of how to handle slashing
  • Define IBC control message types
  • Define theoretical basis for serializability
  • Define validator set as CRDT
  • Add support for tendermint pubkey rotation
  • Define rules around valid staking transitions
  • Argue correctness of staking rules

docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved

For `Unstake`, we should update a local "unstaking" value on the `(user, validator)`
staking info, but not create a claim nor apply a diff to the validator.
We ensure that this "unstaking" amount can never be larger than the properly staked
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can never be larger than the properly staked ... value

I think in an edge case with concurrent slashing, this could be possible. But the operation should fail on the consumer side and play this back to the provider.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on this point?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consumer and. provider side can become out of sync on the staked amount when slashing happens.
Slashing is originated on the consumer side and reduces the effectively staked amount. It will take some time to propagate the event to the provider side. During this period both sides are out of sync and the real stake amount is < provider side stake amount.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that is a very good point. We need a much longer section on slashing. But I will include a space to dig into this point more.

docs/ibc/Staking.md Outdated Show resolved Hide resolved
@ethanfrey ethanfrey marked this pull request as ready for review June 6, 2023 16:30
pub enum ProviderMsg {
/// This should be called on initialization to get current list of validators.
/// Any changes to the set should be sent as a ConsumerMsg::UpdateValidatorSet
ListValidators {},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if we need to fetch and store a copy of the whole valset on the provider chain. With an optimistic approach, we would only store the validators that received delegates from the users (that were acked by the consumer side). That would reduce complexity in the system.
The frontend would need to read the full valset from the consumer chain directly. But this part is not clear to me.

I wrote a bit about validation in: osmosis-labs/mesh-security-sdk#28 (comment)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With an optimistic approach, we would only store the validators that received delegates from the users (that were acked by the consumer side). That would reduce complexity in the system.

You are right. We don't need the info except for ones with stake, so we can slash.

But after reading your issue on validation, I figured it was best to enforce the validator that would receive delegation before creating the packet. It's not dangerous to delegate to an invalid validator address, it will just return an error in ack. But the UI experience is much easier to show error in transaction (a few seconds after signing, when user is on the page), then error in ack (which may occur a minute or two later, maybe they closed the app already).

This is more for application design than enforcing correctness of the protocol.

Copy link
Collaborator

@maurolacy maurolacy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good docs / comments. Just some syntax / style comments.

@@ -89,4 +89,5 @@ Read more about the mechanics of the [Slashing Evidence Handling](./Slashing.md)

## Trust Assumptions

**TODO** list between the consumer and the provider
The [top-level introduction covers many concerns](../README.md#common-concerns)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move those this section? Separate common concerns from IBC specific ones?

docs/ibc/Rewards.md Outdated Show resolved Hide resolved
docs/ibc/Rewards.md Outdated Show resolved Hide resolved
docs/ibc/Rewards.md Outdated Show resolved Hide resolved
docs/ibc/Slashing.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
@ethanfrey
Copy link
Collaborator Author

Okay, I cleaned up the smaller issues here, maybe good starting point now.

I will make a new PR on top of this to dig deeper into correctness of messages and unordered/CRDT items

Copy link
Collaborator

@maurolacy maurolacy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review of Overview and Validators. Will continue during the day.

docs/ibc/Overview.md Show resolved Hide resolved
docs/ibc/Overview.md Show resolved Hide resolved
docs/ibc/Overview.md Outdated Show resolved Hide resolved
docs/ibc/Validators.md Outdated Show resolved Hide resolved
docs/ibc/Validators.md Outdated Show resolved Hide resolved
docs/ibc/Validators.md Outdated Show resolved Hide resolved
docs/ibc/Validators.md Outdated Show resolved Hide resolved
docs/ibc/Validators.md Outdated Show resolved Hide resolved
docs/ibc/Validators.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@maurolacy maurolacy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments / syntax suggestions. Will review Serializability next.

packages/apis/src/ibc.rs Show resolved Hide resolved
docs/ibc/ControlChannel.md Outdated Show resolved Hide resolved
docs/ibc/ControlChannel.md Outdated Show resolved Hide resolved
docs/ibc/ControlChannel.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
Comment on lines 54 to 79
A staking operation would have the following steps and checks:

* Send a lien from `vault` to `external-staking` contract
* Ensure there is sufficient collateral to cover max(lien)
* Ensure there is sufficient collateral in sum(potential slashes)
* Increase the lien for that given user on the `external-staking` contract
* Add a delegation in `external-staking` contract
* Increase stake count for (user, validator)
* Increase total stake on the validator
* Increase the user's shares in the reward distribution
* Send an IBC packet to inform the Consumer
* Guarantee we can commit all above on success
* Guarantee we can rollback all above on error

An unstaking operation would have the following steps and checks:

* Remove a delegation in `external-staking` contract
* Ensure stake count (user, validator) is set and greater than desired unstake amount
* Ensure total stake on the validator is set and greater than desired unstake amount (should always be true if above is true)
* Decrease stake count for (user, validator)
* Decrease total stake on the validator
* Decrease the user's shares in the reward distribution
* Add an entry to the (user, validator) "pending unbonding" to withdraw said tokens after the unbonding period.
* Send an IBC packet to inform the Consumer
* Guarantee we can commit all above on success
* Guarantee we can rollback all above on error
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but we must be really sure we're not missing some scenario / combination of operations here.

Perhaps a simpler initial approach could be, to allow for only one in-flight operation at any given time, and reject new operations until the pending one is completed, either successfully or with error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the locking approach.

We can't just reject other IBC packets, but any other transaction that touches the same keys.
That means, we cannot pay out rewards to that validator or make/remove another lien from that user to anyone.

That is correct, but not efficient. Please read discussion in Serializability and we take this up.

Copy link
Collaborator

@maurolacy maurolacy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More syntax suggestions and comments.

docs/ibc/Serializability.md Outdated Show resolved Hide resolved
docs/ibc/Serializability.md Outdated Show resolved Hide resolved
docs/ibc/Serializability.md Show resolved Hide resolved
docs/ibc/Serializability.md Outdated Show resolved Hide resolved
docs/ibc/Serializability.md Outdated Show resolved Hide resolved
docs/ibc/Serializability.md Outdated Show resolved Hide resolved
docs/ibc/Serializability.md Outdated Show resolved Hide resolved
docs/ibc/Serializability.md Outdated Show resolved Hide resolved
docs/ibc/Serializability.md Outdated Show resolved Hide resolved
Comment on lines 209 to 228
### Value range

One idea to implement this would be to not just store one value (the balance), but a range of values, covering possible values if
all in-flight transactions were to succeed or fail. Normal transactions (send 100 tokens) would update both values and error if either
one violated some invariant (too high or too low). When an IBC transaction is initiated, it would execute eg. "maybe 200", which would
attempt to decrease the min by 200 but leave the max unchanged (requiring this range to remain valid).

This approach would only work by values not used by complex forumulas, such as balances of an AMM (we can't calculate prices off ranges),
or paying out staking rewards (the value received by all other users depends on your stake, and we can't propogate this to all those accounts,
as it would be prohibitively expensive to collapse them all when the transaction is committed or rolled-back).

But for counters with comparisons, increments, and decrements, it could work well. Even enforcing logic like
"total amount of collateral is greater than the max lien amount" could be enforced if collateral and lien amounts
were value ranges. In this case, the max of (`lien.max`) would be compared against `collateral.min`.
With some clever reasoning, we could possibly enforce such value ranges without actually storing multiple fields.
Copy link
Collaborator

@maurolacy maurolacy Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good idea.

One concern I have is that value ranges will need to be cumulative. That is, a value range (or at least, part of it) of a given transaction would depend on the value range of the other un-finished transaction(s).

And if I'm not mistaken, this introduces the issue of ordering again. For each transaction, we would need a set of value ranges, reflecting each of the possible ways the sequence of current pending transactions (including this one) can finish. I mean, not only if they succeed or fail, but also, all of the different orderings or sequences in which they can succeed / fail.

And then we need to check if any new transaction "passes". That is, if the (updated) set of value ranges of all the pending txs, including the new one, is or will still be valid.

The value range set cardinality is O(N!) in the number of pending txs. So, only feasible for a small number of pending txs. This is not a big deal, as we can reject txs upfront if a given threshold (5 or so) of pending txs is reached.

The set will also need to be maintained / updated not only when a new tx arrives, but also when a pending one finishes.

Will think more about it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need a huge set, as it is just min, max for the range.

  • Range(100, 100) + Maybe(+200) + Maybe(-50) => Range(50, 300)
  • Range(50, 300) + Rollback(+200) => Range(50, 100)
  • Range(50, 100) + Commit(-50) => Range(50, 50)

As I went further, I think I would use these ranges mostly as theoretical constructs to evaluate and then use locks if needed.

All queries accessing the said value would need to be range-aware, which is more complex than making them lock-aware. A lock is basically "get access or error". A range means all operations (sum, comparison, etc) would need to check multiple values.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But yeah, no need for O(N!)

But maybe a good idea to have a set of (caller, Maybe) for all pending tx to validate if Rollback or Commit can be safely called.

We can keep this a tool for analysis for now and consider actual Rust implementations if needed in the protocols.

Copy link
Collaborator

@maurolacy maurolacy Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about

  • Range(100, 100) + Maybe(+200) + Maybe(-150) => Range(-50, 300)

?

That would fail in one order, but succeed in the other.

Oh, I see now that your approach makes sense, as that simple approach will guarantee that all possible orderings succeed (which is what I was going after in my comment above).

Copy link
Collaborator

@maurolacy maurolacy Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whereas

  • Range(100, 100) + Maybe(-50) + Maybe(-60) => Range(-10, 100)

will fail. Nice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Balances and similar will also be undefined during pending transactions. They will be value ranges, basically.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, adding the Maybe(-150) would fail, as that would enter a non-commutative state (where there exist different orderings that would provide different results)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I started #52 to implement the read/write lock primitives as actual structs that can be used in cw-storage-plus.

If you feel inspired to start modeling value ranges, feel free. (Just let me know before you start coding in case I grab it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #53 Man, it feels good to code again...

validator, which iterates over all liens to find the sum of max slashing, will be blocked until the first transaction completes.
We cannot actually block (or wait) transactions in this way, so we must return an error for the second transaction.

### Idea: Approximating Locks
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main question here is if we can consider this optimization safe?
Or if we should use a full locking operation here as well?

I'd love some feedback on that

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't review now, but will take a look later for sure.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alpe Your comments above about optimisitic liens are the topic of this section.

Please read it and let me know how it can be improved.

Copy link
Collaborator

@alpe alpe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not had a chance to read all docs but here is some feedback already

In order to make the protocol documents more compact,
[all theoretical foundations are described separately](./Serializability.md).
Please read through that document and have a decent grasp of those concepts before
digging into the sub-protocols below.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for helping navigating through the documents 👍

acks on the sending chain must be also be A, B, C). This is exactly the guarantee that ordered channels provide.

On top of this, we have to ensure that no other transactions conflict with any open IBC transactions. Transaction A is "open" from the
time the first logic is run on the sending chain (which will send the IBC message) until the ack is fully process and committed on
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
time the first logic is run on the sending chain (which will send the IBC message) until the ack is fully process and committed on
time the first logic is run on the sending chain (which will send the IBC message) until the ack is fully processed and committed on

2. Process Packet on Receiving Chain: Aquire all read/write locks on all data, process data, release all locks. This goes from the "growing" phase to the "shrinking" phase.
3. Process Ack on the Sending Chain: Process ack, and release all locks. This is the "shrinking" phase.

If we guarantee that we never read any data that is not under a write lock, we can release the read locks at end of "start tx", as they are not needed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence was a bit hard to understand for me. It helps to read the examples below. I got confused by the motivation for write locks on any data "read only".

some global invariants before initiating a transaction. However, since all changes to the given data is initiated in the provider
blockchain, and we have a complete view of currently in-flight transactions, this approach could work for IBC protocols.

We can say that ICS20 implementation does something like this. As the only invairant is that the value never goes below zero,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We can say that ICS20 implementation does something like this. As the only invairant is that the value never goes below zero,
We can say that ICS20 implementation does something like this. As the only invariant is that the value never goes below zero,

- else `D[V] -= X`, return success ack

This is a pretty straightforward counter with a lower bound of 0, along with an increment and decrement
counters.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good use of sharding by validator!

I was thinking that we technically could even relax the counter constraint of 0 to become negative on the consumer side.
If I have D(a, 100), U(a,50), U(a,50), the result would eventually become 0 in any order. A negative value should just be stored and not trigger any action.
This builds upon the assumption that the provider side keeps a balance of in flight data and never "over undelegates".

On a different view, this is on a validator and not user shard, therefore a buffer will be built up quickly. A hard 0 constraint can act as a circuit breaker against draining as well.

Copy link
Collaborator Author

@ethanfrey ethanfrey Jun 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very hesitant to relax invariants ever, and start handling negative numbers when an epoch hits.

The provider should not even allow the U(a, 50) action to be initiated on that provider side, until it is sure the D(a, 100) is committed. What happens if the first one fails for some reason?

With full CRDTs, we assume nothing will ever be rejected, and all operations will be applied.

With this distributed transaction model, we need to ensure consistency. Minimally, no transaction can read data from an in-flight transaction - meaning the delegation doesn't even start on the provider side until it is fully committed (on ack not on send packet)


Note that a write-lock will prevent reading of the value from any other transaction. That means the same user offering a lien on another
validator, which iterates over all liens to find the sum of max slashing, will be blocked until the first transaction completes.
We cannot actually block (or wait) transactions in this way, so we must return an error for the second transaction.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you can avoid failing the second TX by maintaining state for in flight data in both contracts.
Like credit-card companies, you could place a lock in the vault for the lien amount, that is released and the change applied on success Ack or just dropped on error Ack. The liquid balance would be reduced by the locked amounts.
Similar in external-staking, where the changes are applied only on success Ack. Or reverted on error Ack. Depending at which stage they are taken into account in the contract

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I don't explan well in the document. The concept is:

  • I start a staking action over IBC, it is not committed, we don't actually charge the lien.
  • While in-flight, I withdraw collateral such that it is valid with the pre-stake lien, but invalid with the post-stake lien

We could leave the second tx in a pending state over multiple blocks til the ack from the first point comes in, but this starts breaking all kinds of assumptions about non-ibc messages succeeding or failing atomically.

The other approach is just to reject it as it might fail, which the write lock would do. Note, that a write-lock would reject any possible action that uses said value.

(Actually now going through that case, I think we store individual liens separate from combined user info (collateral, max_lien, total_slashable. We would have to lock both of those items, as a commit would write both of them)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not have the full picture, yet but with my current understanding, the "withdraw collateral" should fail instead. The "stake" action was started before and the vault should reflect that state in the balance. With an optimistic approach, the lien should be taken on start of the "stake" process already. It can be returned on error Ack or timeout.

I do understand the "write-lock" as "one operation only until finished". That constraint would be very hard and can be avoided IMHO. It should be possible to "withdraw collateral" and having a "stake" operation in flight as long as the balance in the vault is sufficient for both.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the locking approach is the easiest one to prove safe.

And from there, I look to reduce constraints, while showing that this can never open up a conflict, assuming the previous version had no conflicts.

I think you are not looking at the latest version, which has a section Idea: Approximating Locks discussing just what you are suggesting (and asking for feedback that this is safe as a lock)

@ethanfrey ethanfrey mentioned this pull request Jun 9, 2023
docs/ibc/Validators.md Outdated Show resolved Hide resolved
@ethanfrey ethanfrey mentioned this pull request Jun 9, 2023
Copy link
Collaborator

@maurolacy maurolacy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some syntax suggestions, and a question about max cap enforcement.

docs/ibc/Staking.md Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
docs/ibc/Staking.md Outdated Show resolved Hide resolved
@maurolacy maurolacy mentioned this pull request Jun 13, 2023
@ethanfrey ethanfrey merged commit c1dbec6 into main Jun 16, 2023
@ethanfrey ethanfrey deleted the define-ibc branch June 16, 2023 08:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Finalize documentation
4 participants