-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(tree): safety check for all node types #422
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This goes in a good direction.
I left many comments asking for some changes, and also some tangent comments for @gballet.
@@ -608,8 +613,14 @@ func (n *InternalNode) Delete(key []byte, resolver NodeResolverFn) (bool, error) | |||
// signal that this node should be deleted | |||
// as well. | |||
for _, c := range n.children { | |||
if _, ok := c.(Empty); !ok { | |||
switch c.(type) { | |||
case *InternalNode, *LeafNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't HashedNode
be part of the first case
? If HashedNode
it means there's an (unresolved) internal or leaf node. (i.e: so not part of the second case
but first).
case UnknownNode: | ||
panic(errUnknownNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't panic here but return an error. This situation happening isn't a bug, but someone trying to operate in a state tree rebuilt from a proof.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that case is a valid tree. If anything, one should try to resolve the node.
default: | ||
panic(errNilNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two points about this:
- I don't think we should panic, but return an error. This might be debatable since this should never happen. But lately we usually prefer to return an error so the node can shutdown gracefully instead of panicking.
- I think the error shouldn't be
errNilNodeType
but (the existing)errUnknownNodeType
. We usually adddefault:
like this to catch cases in the future were we add (for some reason) a new node type (e.g:OptimizedLeafNode
) and we forget to handle properly in these switches. In that case, we'd be returningerrNilNodeType
which wouldn't be correct. (Thenil
type could be interpreted as a special case oferrUnknownNodeType
)
case UnknownNode: | ||
panic(errUnknownNodeType) | ||
default: | ||
panic(errNilNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar as before, I'd take UknownNode
and default
as errors, not panics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nil is also valid, although deprecated. feel free to put a warning in that case, but it's no cause for error or panic.
@@ -329,9 +329,14 @@ func (n *InternalNode) Children() []VerkleNode { | |||
|
|||
// SetChild *replaces* the child at the given index with the given node. | |||
func (n *InternalNode) SetChild(i int, c VerkleNode) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gballet, we should plan to remove this SetChild(...)
since it opens the door for external clients to nuke a lot of invariants on how the tree is constructed.
I've looked where this is used in geth, and is for the Verkle Tree iterator to "resolve" a HashedNode
and then call SetChild(...)
to put the resulting Internal/Leaf node. I believe we can resolve this in a simpler way than requiring this super powerful SetChild(...)
thing.
LMK your thoughts. This isn't urgent in anyway, but if makes sense we can track this by creating an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah ok, PRs welcome :)
return nil, nil, nil, err | ||
} | ||
n.children[i] = c | ||
case *InternalNode, *LeafNode, Empty, UnknownNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm, this UnknownNode
case sounds it should be an error.
@gballet, unfortunately, I think in L920 (below) the c.Commitment()
call would be incorrect. See here. An unknown node shouldn't have an empty commitment, that's not correct since we don't really know what the commitment is. (Can't assume is empty). I think we don't have other chance than panicking there. And here in this switch, separate UnkownNode
case in returning an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, or at least return an error. We don't want all the nodes on the network to panic. The error allows us to quit without destroying the db.
@@ -972,8 +988,14 @@ func (n *InternalNode) Serialize() ([]byte, error) { | |||
// Write the <bitlist>. | |||
bitlist := ret[internalBitlistOffset:internalCommitmentOffset] | |||
for i, c := range n.children { | |||
if _, ok := c.(Empty); !ok { | |||
switch c.(type) { | |||
case *InternalNode, *LeafNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a mistake here, it should also include HashedNode
right? Since that signals a leaf or internal node is present. (Note that the previous code contemplated this case correctly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that was previously true, but now the serialization of an internal node is simply its commitment, so it doesn't matter.
case UnknownNode: | ||
panic(errUnknownNodeType) | ||
default: | ||
panic(errNilNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return errors.
case UnknownNode: | ||
panic(errUnknownNodeType) | ||
default: | ||
panic(errNilNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you switch to return an error by changing this API to also return an error?
I've checked and the caller already returns an error so it can bubble it without breaking APIs for clients.
case UnknownNode: | ||
panic(errUnknownNodeType) | ||
default: | ||
panic(errNilNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry that it took me forever to review this. I don't know if you're still interested in pursuing this PR, the base branch has drifted a bit, but if you are then I'll be quicker to review it.
@@ -329,9 +329,14 @@ func (n *InternalNode) Children() []VerkleNode { | |||
|
|||
// SetChild *replaces* the child at the given index with the given node. | |||
func (n *InternalNode) SetChild(i int, c VerkleNode) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah ok, PRs welcome :)
break | ||
case Empty, HashedNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are trying to figure out if a node is empty here, to see if it can be deleted. So Empty
should definitely not panic, it should continue looping. Same thing with HashedNode
, except that in this case it means that the node is not empty. So it should be part of the break
case like Ignacio says.
case UnknownNode: | ||
panic(errUnknownNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that case is a valid tree. If anything, one should try to resolve the node.
case Empty, HashedNode: | ||
case UnknownNode: | ||
panic(errUnknownNodeType) | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a nil
is a valid input, it should not panic. You are welcome to print a debug message to point out that this is deprecated, and that the code should be fixed, but panicking is too dangerous.
n.children[i] = HashedNode{} | ||
case Empty, HashedNode: | ||
case UnknownNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't panic here!
- if the node is unknown or hashed, it means that it hasn't been resolved, so we don't need to resolve it and we can safely ignore it.
- if the node is empty, there is nothing to flush. This is fine, the tree is completely valid.
case UnknownNode: | ||
panic(errUnknownNodeType) | ||
default: | ||
panic(errNilNodeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nil is also valid, although deprecated. feel free to put a warning in that case, but it's no cause for error or panic.
return nil, nil, nil, err | ||
} | ||
n.children[i] = c | ||
case *InternalNode, *LeafNode, Empty, UnknownNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, or at least return an error. We don't want all the nodes on the network to panic. The error allows us to quit without destroying the db.
@@ -972,8 +988,14 @@ func (n *InternalNode) Serialize() ([]byte, error) { | |||
// Write the <bitlist>. | |||
bitlist := ret[internalBitlistOffset:internalCommitmentOffset] | |||
for i, c := range n.children { | |||
if _, ok := c.(Empty); !ok { | |||
switch c.(type) { | |||
case *InternalNode, *LeafNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that was previously true, but now the serialization of an internal node is simply its commitment, so it doesn't matter.
if child == nil { | ||
panic(errNilNodeType) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log a warning, don't panic
This PR adds a safety check whenever looping through an
InternalNode
's children.There are 2 parts to this PR:
1. Nil check
From my understanding, an
InternalNode
's child should be either the following:InternalNode
,LeafNode
,Empty
,HashedNode
, orUnknownNode
. Hence, a child who isnil
is not accepted, and should be considered an abnormal behavior. Based on the current codebase, the only place that can change anInternalNode
's children tonil
is theSetChild
function. Hence, we should explicitly check for them and panic as soon as they are encountered.2. Node type switch
There are plenty of places that do something like
if _, ok := c.(Empty); !ok
, which implies that any node type other thanEmpty
is true. This isn't necessarily the true intention, because we also wouldn't want to deal withHashedNode
andUnknownNode
. Adding a node type switch ensures that we only want to deal with the correct node type (i.e. usuallyInternalNode
andLeafNode
only).