COW in LeafNodes #314

jsign · 2022-12-13T23:27:21Z

This PR contains multiple performance improvements that will be explained in PR comments.

These changes were motivated by optimizing the benchmarks in this go-ethereum draft PR. It's recommended to see that PR first so you have a top-down approach to understanding the changes.

Of course, this had a noticeable impact on the benchmarks of this repo too:

name                                        old time/op    new time/op    delta
ProofCalculation-16                            1.87s ± 2%     0.08s ± 3%  -95.75%  (p=0.003 n=5+7)
ProofVerification-16                          26.9ms ± 2%    26.5ms ± 1%     ~     (p=0.171 n=5+8)
GroupToField/single-16                        1.53µs ± 2%    1.45µs ± 2%   -5.08%  (p=0.002 n=5+8)
GroupToField/multiple/1-16                    1.61µs ± 4%    1.57µs ± 3%   -2.94%  (p=0.048 n=5+7)
GroupToField/multiple/2-16                    1.83µs ± 4%    1.73µs ± 1%   -5.24%  (p=0.003 n=5+7)
GroupToField/multiple/4-16                    2.53µs ± 2%    2.38µs ± 3%   -5.88%  (p=0.002 n=5+8)
GroupToField/multiple/8-16                    3.96µs ± 4%    3.75µs ± 2%   -5.36%  (p=0.002 n=5+8)
GroupToField/multiple/16-16                   6.26µs ± 3%    6.01µs ± 1%   -4.10%  (p=0.003 n=5+8)
GroupToField/multiple/32-16                   11.2µs ± 4%    10.9µs ± 2%   -2.55%  (p=0.006 n=5+8)
GroupToField/multiple/64-16                   21.5µs ± 3%    20.9µs ± 2%   -2.99%  (p=0.006 n=5+8)
GroupToField/multiple/128-16                  40.2µs ± 2%    39.3µs ± 0%   -2.29%  (p=0.003 n=5+7)
GroupToField/multiple/256-16                  79.8µs ± 4%    77.8µs ± 1%   -2.54%  (p=0.010 n=5+7)
CommitLeaves/insert/leaves/1000-16            47.2ms ± 2%    11.9ms ± 3%  -74.71%  (p=0.002 n=5+8)
CommitLeaves/insertOrdered/leaves/1000-16     66.2ms ± 3%    66.5ms ± 4%     ~     (p=0.833 n=5+8)
CommitLeaves/insert/leaves/10000-16            471ms ± 2%      65ms ± 5%  -86.19%  (p=0.002 n=5+8)
CommitLeaves/insertOrdered/leaves/10000-16     626ms ± 1%     830ms ± 2%  +32.68%  (p=0.002 n=5+8)
CommitFullNode-16                             6.30ms ± 5%    5.12ms ± 2%  -18.66%  (p=0.003 n=5+7)
ModifyLeaves-16                                380ms ± 4%     169ms ± 3%  -55.40%  (p=0.002 n=5+8)

Note: this PR is optimistically sitting on top of another branch of go-ipa that has other optimizations. I'll update the go.mod to the official commit when that gets merged. Only after we do that, the replay block GH action will run correctly for go.mod editing reasons.

config_ipa.go

encoding.go

go.mod

proof_test.go

tree.go

gballet

Very nice. I'm mostly behind these changes, although I'd rather have a cow approach for the commitment calculation in LeafNode than what you did here with the nil commitments. I believe this will lead to even higher performance.

proof_test.go

go.mod

encoding.go

config_ipa.go

tree.go

jsign

@gballet, thanks for the review!

For completeness in this PR discussion, we've discussed a bit and decided to wait for #295 to be merged first. After that, I'll re-run the benchmarks, and we can plan on keep moving forward with this PR.

jsign · 2023-01-11T20:19:57Z

tree_test.go

+func TestMain(m *testing.M) {
+	_ = GetConfig()
+	os.Exit(m.Run())
+}


Instead of doing GetConfig() in each test, we can make this part of the startup process before running the tests, so we don't have to repeat ourselves.

I don't think that is necessary: once the first test is called, it will initialize the cfg variable for all subsequent tests. Not sure what you are trying to achieve here?

Avoiding adding GetConfig() in each test.

You can't rely on only doing that in the first test of the file, because if you later do go test ./... -run=TestSomeParticularOne, that will panic if isn't the first one in the file. So that means you have to do GetConfig() on every test, which can be avoiding on doing this in TestMain().

Makes sense to you?

jsign · 2023-01-11T20:20:47Z

tree_test.go

 	root := New()
 	root.Insert(ffx32KeyTest, zeroKeyTest, nil)
+	root.Commit()


It's quite important now to always do Commit() after doing changes, since we're trying to be "as lazy as possible" to do premature work if values keep changing before being ready to commit.

jsign · 2023-01-11T20:21:56Z

tree.go

-// Committer represents an object that is able to create the
-// commitment to a polynomial.
-type Committer interface {
-	CommitToPoly([]Fr, int) *Point
-}


Interestingly, this interface isn't used anymore. This isn't because changes of this PR. If we delete this interface in current master, all compiles and work the same.

yes, go-verkle used to support two schemes and one of them got dropped. I guess we can remove it, but in the interest of keeping a clean git history, I would appreciate if you could make this another PR.

jsign · 2023-01-11T20:27:53Z

tree.go

@@ -187,6 +183,7 @@ type (

 		commitment *Point
 		c1, c2     *Point
+		cow        map[byte][]byte


OK, so this is the "main" change of this second version of the PR.
The idea is that we do the "COW" idea in LeafNodes.

If we've never calculated a (*LeafNode).commitment, whenever we do (*LeafNode).Commit() we can do the same as we did before. Taking all the slice of values, and calculate the commitment.

But, if we already have a (*LeafNode).commitment != nil, then every time we touch that leaf node, we keep track in (*LeafNode).cow which was the previous value, and update (*LeafNode).values with the new value. Whenever the user calls Commit(...), the code will notice that we already had a previous (*LeafNode).commitment != nil and a (*LeafNode).cow with some tracked changes. Then, we do the diff-updating.

In summary, instead of doing diff-updating as soon as the user calls Delete(...) or Insert*(...) which previously did the diff-updating straight; we now only keep track in (*LeafNode).cow and only do the real diff-updating when Commit() is called.

So if the client is updating multiple times the same leaf value, we avoid doing diff-updatings that are overwritten. We only do it once when Commit() is called.

Also, this centralized the logic of *LeafNode commitment updating in a single place: Commit(). (i.e: we don't have commitment calculations spread in NewLeafNode(..), Insert*(...) , and Delete(...). There's a single place where that work happens.

This changes the previous version of this PR pattern if n.commitment == nil { n.Commit() } that we wanted to change.

The ^ is the complete picture of how COW works for leaf nodes. The idea of the comment is to give you the full picture. I'll dive into some extra details about where stuff happens.

jsign · 2023-01-11T20:29:12Z

tree.go

+	// If we're at the root internal node, we fire goroutines exploiting
+	// available cores. Those goroutines will recursively take care of downstream
+	// layers, so we avoid creating too many goroutines.


I added this extra comment to explain this idea of n.depth == 0. Doesn't hurt to be a bit more explicit for future readers.

The idea is interesting, although I still need to think it over to convince myself that it will have an impact. I doubt that will be the case for InsertOrdered, for example, since by the time a depth of 0 is reached, most of the tree will have already been flushed.

It is an idea that, I think, would be great to implemen in FlushAtDepth, btw.

It is however very interesting when computing the root during block execution.

So all in all, I'm in favor of this, but just as before, I would appreciate if this was done in a separate PR, so that we can check its contribution independently, and also keep a somewhat cleaner git history.

jsign · 2023-01-11T20:42:06Z

tree.go

-	var c1, c2 *Point
-	var old1, old2 *Fr
-	for i, v := range values {
-		if len(v) != 0 && !bytes.Equal(v, n.values[i]) {
-			if i < 128 {
-				if c1 == nil {
-					c1, old1 = n.getOldCn(byte(i))
-				}
-				n.updateCn(byte(i), v, c1)
-			} else {
-				if c2 == nil {
-					c2, old2 = n.getOldCn(byte(i))
-				}
-				n.updateCn(byte(i), v, c2)
-			}
-
-			n.values[i] = v
+	for i := range values {
+		if values[i] != nil {
+			n.updateLeaf(byte(i), values[i])
 		}
 	}
-
-	if c1 != nil {
-		n.updateC(0, c1, old1)
-	}
-	if c2 != nil {
-		n.updateC(128, c2, old2)
-	}
 }


This updateMultipleLeaves(...) got simplified, so this logic disappeared.
Updating multiple leaves is simply calling updateLeaf(...) explained before (i.e: tracking value changes).

jsign · 2023-01-11T20:42:38Z

tree.go

+	},
+}
+
+func (leaf *LeafNode) Commit() *Point {


OK, so here we have the new version of (*LeafNode).Commit().
Here we do what we explained in the TL;DR before, but I'll repeat a bit in further comments.

jsign · 2023-01-11T20:44:21Z

tree.go

+	// If we've never calculated a commitment for this leaf node, we calculate the commitment
+	// in a single shot considering all the values.
+	if leaf.commitment == nil {


As explained in the comment I added, if this leaf node never was Commit(..)ed before, we do the known logic to commit to all the leaf.values in the polynomial commitment.

We could in theory only do diff-updating starting from a "empty" commitment and do diff-updates on top. But that's quite slow... if we have a bunch of values that are all new (since we have never Commited before), we do all the calculation in a single poly commitment. The code of this "if" block is the same as the previous version of this PR, so nothing interesting is needed for reviewing.

could that not be simplified by simply setting the commitment to zero and then applying the CoW logic below to it? This should be equivalent, unless I missed something?

Yes, I went with that approach initially since it would avoid an "if" case here. Unfortunately, that was slower.

If you have a calculated commitment and 10 values to update with diffs-commits, you have to do 10 diff commitment which is slower than doing a single 10-values polynomial commitment.

In fact, at some point if your cow is showing that you changed 256 values, probably we shouldn't be doing diff-updating too. (Under the same logic) Might be useful to try to discover when diff-updating is slower than doing the full calculation again.

jsign · 2023-01-11T20:46:35Z

tree.go

+	} else if len(leaf.cow) != 0 {
+		// If we've already have a calculated commitment, and there're touched leaf values, we do a diff update.
+		var c1, c2 *Point
+		var old1, old2 *Fr
+		for i, oldValue := range leaf.cow {
+			if !bytes.Equal(oldValue, leaf.values[i]) {
+				if i < 128 {
+					if c1 == nil {
+						c1, old1 = leaf.getOldCn(i)
+					}
+					leaf.updateCn(i, oldValue, c1)
+				} else {
+					if c2 == nil {
+						c2, old2 = leaf.getOldCn(i)
+					}
+					leaf.updateCn(i, oldValue, c2)
+				}
+			}
+		}
+
+		if c1 != nil {
+			leaf.updateC(0, c1, old1)
+		}
+		if c2 != nil {
+			leaf.updateC(128, c2, old2)
+		}
+		leaf.cow = nil


Now we have the new case. In this case we know that leaf.commitment != nil (so we already did Commit(...) in this leaf node), and we see that there're values in leaf.cow which means that values where changed.

What we do here is something similar to what we did before for diff updating. The only "meaningful" change is that we send oldValue toupdateCn(...) instead of new values (as explained in a previous comment). This is only because in leaf.values we have the new values, and the old ones in leaf.cow.

In L1143 we cleanup the leaf.cow map to reclaim some memory, and allow that to be created again if new values in this leaf change again.

jsign · 2023-01-11T20:48:52Z

tree.go

+	// The length of the result is at a minimum the sum of:
+	// 1 = leafRLPType
+	// 31 = length of n.stem
+	// 32 = length of bitlist
+	// 2*32 = c1bytes and c2bytes
+	// And we add 4*32 bytes of extra capacity for an ~expected 4 children.
+	result := make([]byte, 1+31+32+2*32, 1+31+32+2*32+4*32)
+	result[0] = leafRLPType
+	copy(result[1:], n.stem[:31])
+	copy(result[1+31:], bitlist[:])
+	copy(result[1+31+32:], c1bytes[:])
+	copy(result[1+31+32+32:], c2bytes[:])
+	result = append(result, children...)
+	return result, nil


This is a change that already existed in the previous version of this PR that avoids a lot of unnecessary capacity by the previous append(append(... pattern.

But, after rebasing to the latest master I had to fix it adding this extra c1bytes and c2bytes that are now part of the serialization (probably makes sense to you because you did that addition). The idea is the same, jus that some extra space is specifically allocated for that new stuff.

Also, in L1347-L1352 I added some comment to explain what's the idea here. I think without the comment was somewhat obvious that we're trying to allocate as close as possible to the space we need. But... it doesn't hurt to explain where const-numbers come from.

very nice, I also would like this to be a separate PR, though.

gballet

Thank you, that looks much better. The issue is that it doesn't seems to speed things up during the block replay... It makes no sense to me, but it's a fact.

I suggest breaking all the optimizations that are not cow-related into separate PRs, so that we can sift what works and what needs to be reworked. Good work in any case, sorry this isn't as rewarding as we hoped just yet.

gballet · 2023-01-13T15:57:30Z

tree.go

-// Committer represents an object that is able to create the
-// commitment to a polynomial.
-type Committer interface {
-	CommitToPoly([]Fr, int) *Point
-}


yes, go-verkle used to support two schemes and one of them got dropped. I guess we can remove it, but in the interest of keeping a clean git history, I would appreciate if you could make this another PR.

gballet · 2023-01-13T16:20:24Z

tree.go

@@ -219,23 +216,6 @@ func NewLeafNode(stem []byte, values [][]byte) *LeafNode {
 		c2:     Generator(),
 	}

-	// Initialize the commitment with the extension tree


if you remove the commitment calculation, shouldn't the cow field be initialized with values at this point?

No, because commitment is nil in this case. In that case, the Commit() function already sees that as a signal that there's no previous commitment that we can do diff-updating; so it will do the usual full computation in the vector.

That means we don't need to initialize cow here. cow is initialized whenever we have a previous commitment calculation, and any value of the leaf is changed. In that case, we track in cow since will do a diff-update in the next Commit().

gballet · 2023-01-13T16:20:30Z

tree.go

+	// If we're at the root internal node, we fire goroutines exploiting
+	// available cores. Those goroutines will recursively take care of downstream
+	// layers, so we avoid creating too many goroutines.


The idea is interesting, although I still need to think it over to convince myself that it will have an impact. I doubt that will be the case for InsertOrdered, for example, since by the time a depth of 0 is reached, most of the tree will have already been flushed.

It is an idea that, I think, would be great to implemen in FlushAtDepth, btw.

It is however very interesting when computing the root during block execution.

So all in all, I'm in favor of this, but just as before, I would appreciate if this was done in a separate PR, so that we can check its contribution independently, and also keep a somewhat cleaner git history.

gballet · 2023-01-13T19:40:55Z

tree.go

+	// If we've never calculated a commitment for this leaf node, we calculate the commitment
+	// in a single shot considering all the values.
+	if leaf.commitment == nil {


could that not be simplified by simply setting the commitment to zero and then applying the CoW logic below to it? This should be equivalent, unless I missed something?

gballet · 2023-01-13T19:43:31Z

tree.go

+	// The length of the result is at a minimum the sum of:
+	// 1 = leafRLPType
+	// 31 = length of n.stem
+	// 32 = length of bitlist
+	// 2*32 = c1bytes and c2bytes
+	// And we add 4*32 bytes of extra capacity for an ~expected 4 children.
+	result := make([]byte, 1+31+32+2*32, 1+31+32+2*32+4*32)
+	result[0] = leafRLPType
+	copy(result[1:], n.stem[:31])
+	copy(result[1+31:], bitlist[:])
+	copy(result[1+31+32:], c1bytes[:])
+	copy(result[1+31+32+32:], c2bytes[:])
+	result = append(result, children...)
+	return result, nil


very nice, I also would like this to be a separate PR, though.

gballet · 2023-01-13T19:47:27Z

tree_test.go

+func TestMain(m *testing.M) {
+	_ = GetConfig()
+	os.Exit(m.Run())
+}


I don't think that is necessary: once the first test is called, it will initialize the cfg variable for all subsequent tests. Not sure what you are trying to achieve here?

jsign · 2023-01-13T20:09:15Z

I suggest breaking all the optimizations that are not cow-related into separate PRs, so that we can sift what works and what needs to be reworked. Good work in any case, sorry this isn't as rewarding as we hoped just yet.

Sounds good to me. I can break up changes in different PRs, and probably that will help.

Signed-off-by: Ignacio Hagopian <[email protected]>

jsign · 2023-03-01T17:35:42Z

@gballet, I still want to reconsider running this idea again after some other performance improvements we'll include soon.

Moving to draft to just avoid noise in the review PR order; moving to "paused mode" here for some weeks.

This was referenced Dec 13, 2022

Avoid forced heap escaping for polynomial commitment crate-crypto/go-ipa#31

Merged

VKT: Optimizations gballet/go-ethereum#147

Closed

jsign commented Dec 14, 2022

View reviewed changes

gballet reviewed Dec 16, 2022

View reviewed changes

jsign commented Dec 18, 2022

View reviewed changes

gballet force-pushed the master branch 8 times, most recently from 1d4d60c to a649f3c Compare January 9, 2023 09:44

jsign force-pushed the jsign/multiplefr branch 5 times, most recently from 0ecada1 to bf4de31 Compare January 11, 2023 20:06

jsign changed the title ~~Multiple performance improvements~~ COW in LeafNodes & batch Point->Frs & other improvements Jan 11, 2023

jsign force-pushed the jsign/multiplefr branch 3 times, most recently from 939ef88 to 3e8bae5 Compare January 12, 2023 14:22

jsign commented Jan 12, 2023

View reviewed changes

jsign marked this pull request as ready for review January 12, 2023 14:24

gballet reviewed Jan 13, 2023

View reviewed changes

implement cow on leaves

eca7921

Signed-off-by: Ignacio Hagopian <[email protected]>

jsign force-pushed the jsign/multiplefr branch from 3e8bae5 to eca7921 Compare January 19, 2023 21:41

jsign changed the title ~~COW in LeafNodes & batch Point->Frs & other improvements~~ COW in LeafNodes Jan 19, 2023

jsign marked this pull request as draft March 1, 2023 17:35

jsign added the on-hold label Mar 1, 2023

gballet mentioned this pull request Apr 21, 2023

overlay transition gballet/go-ethereum#198

Closed

3 tasks

This was referenced Apr 21, 2023

Include base -> overlay key-values migration logic gballet/go-ethereum#199

Merged

Batch NewLeaf node work in base API #349

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

COW in LeafNodes #314

COW in LeafNodes #314

jsign commented Dec 13, 2022 •

edited

Loading

gballet left a comment

jsign left a comment

jsign Jan 11, 2023

gballet Jan 13, 2023

jsign Jan 13, 2023

jsign Jan 11, 2023

jsign Jan 11, 2023

gballet Jan 13, 2023

jsign Jan 11, 2023

jsign Jan 11, 2023

jsign Jan 11, 2023

gballet Jan 13, 2023

jsign Jan 11, 2023

jsign Jan 11, 2023

jsign Jan 11, 2023

gballet Jan 13, 2023

jsign Jan 13, 2023 •

edited

Loading

jsign Jan 11, 2023

jsign Jan 11, 2023

gballet Jan 13, 2023

gballet left a comment

gballet Jan 13, 2023

gballet Jan 13, 2023

jsign Jan 13, 2023

gballet Jan 13, 2023

gballet Jan 13, 2023

gballet Jan 13, 2023

gballet Jan 13, 2023

jsign commented Jan 13, 2023

jsign commented Mar 1, 2023

COW in LeafNodes #314

Are you sure you want to change the base?

COW in LeafNodes #314

Conversation

jsign commented Dec 13, 2022 • edited Loading

gballet left a comment

Choose a reason for hiding this comment

jsign left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsign Jan 13, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gballet left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsign commented Jan 13, 2023

jsign commented Mar 1, 2023

jsign commented Dec 13, 2022 •

edited

Loading

jsign Jan 13, 2023 •

edited

Loading