Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
COW in LeafNodes #314
base: master
Are you sure you want to change the base?
COW in LeafNodes #314
Changes from all commits
eca7921
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so this is the "main" change of this second version of the PR.
The idea is that we do the "COW" idea in
LeafNodes
.If we've never calculated a
(*LeafNode).commitment
, whenever we do(*LeafNode).Commit()
we can do the same as we did before. Taking all the slice ofvalues
, and calculate the commitment.But, if we already have a
(*LeafNode).commitment != nil
, then every time we touch that leaf node, we keep track in(*LeafNode).cow
which was the previous value, and update(*LeafNode).values
with the new value. Whenever the user callsCommit(...)
, the code will notice that we already had a previous(*LeafNode).commitment != nil
and a(*LeafNode).cow
with some tracked changes. Then, we do the diff-updating.In summary, instead of doing diff-updating as soon as the user calls
Delete(...)
orInsert*(...)
which previously did the diff-updating straight; we now only keep track in(*LeafNode).cow
and only do the real diff-updating whenCommit()
is called.So if the client is updating multiple times the same leaf value, we avoid doing diff-updatings that are overwritten. We only do it once when
Commit()
is called.Also, this centralized the logic of
*LeafNode
commitment updating in a single place:Commit()
. (i.e: we don't have commitment calculations spread inNewLeafNode(..)
,Insert*(...)
, andDelete(...)
. There's a single place where that work happens.This changes the previous version of this PR pattern
if n.commitment == nil { n.Commit() }
that we wanted to change.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ^ is the complete picture of how COW works for leaf nodes. The idea of the comment is to give you the full picture. I'll dive into some extra details about where stuff happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you remove the commitment calculation, shouldn't the
cow
field be initialized with values at this point?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because
commitment
isnil
in this case. In that case, theCommit()
function already sees that as a signal that there's no previous commitment that we can do diff-updating; so it will do the usual full computation in the vector.That means we don't need to initialize
cow
here.cow
is initialized whenever we have a previous commitment calculation, and any value of the leaf is changed. In that case, we track incow
since will do a diff-update in the nextCommit()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As part of the leaf COW change,
updateCn
now works slightly different.Instead of receiving the new value, and getting the old from
n.values
, it does the opposite.It receives the old value (now from
cow
), and uses the new values which were already stored inn.values
.So the logic is the same, the only thing that changes is where we get the old and new points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, here's one half of the meat of leaf COWing.
We remove the current diff-updating work, so no heavy stuff is done when VKT APIs are called (e.g: insert, delete, etc). We only will do work in
Commit()
(to be explained below).What we do now here, is what was explained in the TL;DR before:
n.commitment == nil
, we don't create the COW map since there aren't any meaningful "previous values" to track. It isn't adding anything, and we can avoid allocating a map with "nil" previous entries.n.commitment != nil
, that means we already did a previousCommit(..)
in this leaf node. So we create the COW, and keep track of the previous values and insert the new one invalues
. This is the information we need to track to allowCommit(...)
do the diff updating.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This
updateMultipleLeaves(...)
got simplified, so this logic disappeared.Updating multiple leaves is simply calling
updateLeaf(...)
explained before (i.e: tracking value changes).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so here we have the new version of
(*LeafNode).Commit()
.Here we do what we explained in the TL;DR before, but I'll repeat a bit in further comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As explained in the comment I added, if this leaf node never was
Commit(..)
ed before, we do the known logic to commit to all theleaf.values
in the polynomial commitment.We could in theory only do diff-updating starting from a "empty" commitment and do diff-updates on top. But that's quite slow... if we have a bunch of values that are all new (since we have never Commited before), we do all the calculation in a single poly commitment. The code of this "if" block is the same as the previous version of this PR, so nothing interesting is needed for reviewing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could that not be simplified by simply setting the commitment to zero and then applying the CoW logic below to it? This should be equivalent, unless I missed something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I went with that approach initially since it would avoid an "if" case here. Unfortunately, that was slower.
If you have a calculated commitment and 10 values to update with diffs-commits, you have to do 10 diff commitment which is slower than doing a single 10-values polynomial commitment.
In fact, at some point if your cow is showing that you changed 256 values, probably we shouldn't be doing diff-updating too. (Under the same logic) Might be useful to try to discover when diff-updating is slower than doing the full calculation again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we have the new case. In this case we know that
leaf.commitment != nil
(so we already didCommit(...)
in this leaf node), and we see that there're values inleaf.cow
which means that values where changed.What we do here is something similar to what we did before for diff updating. The only "meaningful" change is that we send
oldValue
toupdateCn(...)
instead of new values (as explained in a previous comment). This is only because inleaf.values
we have the new values, and the old ones inleaf.cow
.In L1143 we cleanup the
leaf.cow
map to reclaim some memory, and allow that to be created again if new values in this leaf change again.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of doing
GetConfig()
in each test, we can make this part of the startup process before running the tests, so we don't have to repeat ourselves.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is necessary: once the first test is called, it will initialize the
cfg
variable for all subsequent tests. Not sure what you are trying to achieve here?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoiding adding
GetConfig()
in each test.You can't rely on only doing that in the first test of the file, because if you later do
go test ./... -run=TestSomeParticularOne
, that will panic if isn't the first one in the file. So that means you have to doGetConfig()
on every test, which can be avoiding on doing this inTestMain()
.Makes sense to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's quite important now to always do
Commit()
after doing changes, since we're trying to be "as lazy as possible" to do premature work if values keep changing before being ready to commit.